ABSTRACT


Many programs aimed at airborne mine and minefield detection are being pursued,
and different algorithms are being developed and evaluated to meet performance
specifications. Thus far, no single algorithm or detection architecture has been able to
fulfill the performance specifications for different mine and minefield detection
scenarios. The reasons for this are numerous. The environment and the operating
conditions under which an airborne sensor is expected to perform are highly varied. Also,
the performance of airborne sensors and algorithms is highly dependent on the type of
targets and environments. Research has aimed to make the algorithms more robust
under these varying conditions, but the studies have been only partially successful. A
large amount of data needs to be collected and analyzed to gain insight into detection
algorithms and their performance under different operating conditions. Data collection on
this scale is time consuming and costly. For this reason, a need exists for a
simulation-based approach. One such simulation system is developed and evaluated in this thesis.
The factors affecting the performance of an airborne detection system include physical
parameters  (type  of  background,  time  of  day),  data  collection  parameters  (swath  width, 
number  of  steps,  in-step  and  in-flight  overlap),  and  minefield  scenarios.  Data  collection 
parameters  are  included  in  the  simulation  tool.  False  alarms  and  mine  statistics  are 
modeled  based  on  the  available  data  collected  as  a  part  of  the  developmental  programs. 
Various  mine  and  minefield  detection  algorithms  are  modeled  and  evaluated.  Simulations 
are  run,  and  Receiver  Operating  Characteristic  (ROC)  curves  are  used  to  evaluate  the 
performance  at  both  the  mine  and  minefield  levels.  Analytical  models  for  minefield 
detection  performance  are  formulated  and  used  to  validate  the  simulated  performance. 

1. INTRODUCTION


Solutions to landmine and minefield detection have been sought for more than 75
years. A number of different sensors, algorithms, and technologies have been proposed
and are now being pursued for landmine detection. All mine detection technologies can
be classified into two major categories: ground-based mine detection and airborne
minefield detection. Ground-based technologies consist of handheld and vehicular mine
detection, in which either a human or a vehicle carrying a mine detection device scans the
neighborhood for mines. However, these techniques suffer from a number of obvious
drawbacks. First, because humans are present, the risk of fatal errors always
exists. It is estimated that for every 2000 mines cleared, a fatal human error occurs
[Ghaffari et al., 2004]. Second, ground-based methods are slow: one person
clearing mines by hand can clear only 20 to 50 square meters per day, and a vehicle can
cover about 15,000 square meters per day, which is often considered to be very slow
[Dincerler, 1995]. Third, mines come in different materials as well as different
sizes and shapes, which limits the equipment and technologies that can be used for
landmine detection.

Because of the risk of fatal errors and the slow clearing rate of ground-based
detection systems, and for tactical reasons in countermine operations, airborne
mine/minefield detection using unmanned airborne vehicles (UAVs) has gained
popularity in recent years. Some of the recent airborne minefield detection programs in
this research area include the Airborne Far IR Minefield Imaging System (AFIRMIS)
[Simrad and Mathieu, 1998], Remote Minefield Detection System (REMIDS) [1999],
Cobra Reconnaissance and Analysis System (COBRA) [Witherspoon et al., 1995], and
Lightweight Airborne Multi-spectral Minefield Detection System (LAMD) [Haskett and
Reago, 2001]. Visual, near-infrared, and mid-wave infrared images of a minefield taken
from an airborne platform are processed at a ground station to determine the likely
presence of minefields and other obstacles. The data are collected using a sensor mounted
on a gimbal in either a push-broom or step-stare manner. The detection process typically
follows  a  sequential  paradigm  based  on  the  detection  of  mine-like  anomalies  followed  by 
the  detection  of  minefield-like  patterns. 

A huge amount of data is being collected and analyzed to gain insight into the
actual mine and minefield detection algorithms and their performance. Various anomaly
detection algorithms, such as RX [Reed and Yu, 1990], the radial anomaly detector [Menon
and Agarwal, 2003], cluster-based anomaly detection [Carlotto, 2005], signal subspace
processing [Ranney, 2006], and support vector data description [Banerjee et al., 2006], and
false alarm reduction methods, such as gray-scale moments [Sriram et al., 2002],
circularity [Menon et al., 2004], and reflection symmetry [Menon et al., 2004; Kiryati and
Gofman, 1996], have been used to identify mine-like targets. Various minefield detection
algorithms have similarly been proposed, such as the empty boxes test algorithm [Lake et
al., 1997], linear pattern detection [Malloy, 2003; Muise and Smith, 1995], Hough line
transforms [Carlson et al., 1994], Scatter Number [Earp, 2000b], and Scatter Log
weighted [Earp et al., 1995].

Apart from the algorithm choices, the performance of the system depends on
various other factors, including minefield scenarios (such as scattered or patterned) and
data collection parameters such as swath width, side-step overlap, and the size of the field
of regard (FoR). Mine and minefield detection algorithms need to be evaluated for
different backgrounds, such as arid and temperate. To evaluate the performance of the sensors
and detection algorithms for different scenarios and algorithm choices, an enormous
amount of data collection is required. This collection and subsequent analysis of data is
one of the most expensive aspects of system development and evaluation.
Moreover, it is impossible to collect data for all possible variations in mine and
minefield detection scenarios and other data collection parameters. However, it is quite
possible to generate reasonably accurate simulation data under different sensors,
minefield layout scenarios, and mission-specific constraints. The simulated data can
subsequently be used to evaluate the performance of different choices of algorithms and
data collection models under different scenarios. This thesis describes the design of a
simulation tool to evaluate mine and minefield level performance based on simulated
data under different sensors and mission profile parameters. Mine and minefield level
performance obtained using simulated data is verified using analytical results and
available data.

This thesis is organized as follows. Section 2 starts with the basic airborne
minefield detection system and explains its various blocks with the help of actual data.
Section 3 explains the simulation system, which is developed to simulate the minefield
detection system, and also describes the flexibility incorporated in the simulated system.
This system evaluates the performance for a particular choice of data collection
parameters, minefield scenarios, and mine and minefield detection and thresholding
algorithms.

Section 4 presents a detailed explanation of the RX anomaly detector. The RX
detection values are then modeled using standard probabilistic models. The modeling
results for different times of day and different background types are provided. The
performance of these models for different datasets is tested using the chi-square test.

Section 5 deals with spatial distributions that are used to model the spatial
locations of the false alarms. Complete spatial random processes, their tests, and
generation methods are discussed in some detail. Preliminary results for the spatial
distribution of mines and false alarms are also presented.

Section 6 deals with analytical models for different minefield detection methods.
It also explains the choice of minefield detection algorithms used for detecting the
presence of a minefield. Minefield scoring techniques are also explained in this section.
Section 7 shows simulation results for a particular set of parameters. Simulation results
are generated showing the effect of various parameters including swath width, signal to
clutter ratio, constant false alarm rate, and segment overlap. This thesis concludes with
Section 8, which discusses the general conclusions and future work in this area.

Some  of  the  supporting  concepts  and  derivations  are  included  in  four  Appendices. 
Appendix  A  lists  various  spectral  vegetation  indices.  It  also  shows  the  capability  of  these 
indices  to  differentiate  between  vegetation  and  non-vegetation  (rock,  soil,  etc.).  Appendix 
B  discusses  the  EM  algorithm  and  its  mathematical  formulation.  It  also  derives  the 
formulation  of  the  update  equation  and  finally  parameter  estimation  for  RX  statistics. 
Convergence  properties  of  the  EM  algorithm  are  also  discussed.  Appendix  C  explores  the 
method  of  moments  to  estimate  the  initial  parameters  for  RX  distribution  to  be  used  for 
the  EM  estimation.  Appendix  D  discusses  different  tests  to  evaluate  the  goodness  of  fit 
for  the  probability  model  obtained. 

2. PROBLEM DESCRIPTION

This  section  describes  the  basic  airborne  minefield  detection  system  and  the  role 
different  parameters  play  in  determining  detection  performance.  A  typical  airborne 
minefield  detection  system  can  be  described  in  the  form  of  the  block  diagram  in  Figure 
2.1.  It  consists  of  three  different  stages,  each  represented  by  a  single  block.  The  first  stage 
involves  data  collection,  which  deals  with  various  factors  such  as  platform  data,  sensor 
data,  minefield  layout,  and  background  data.  Once  the  data  are  collected,  they  are 
processed  in  the  mine  level  detection  block  to  provide  a  list  of  anomalies,  which  are 
different  for  different  backgrounds  and  depend  on  the  anomaly  detection  algorithm  used 
for the processing. The anomaly values are then thresholded and passed to the minefield level
detection block, where the thresholded anomaly values are processed along with their
spatial locations to provide a confidence metric pertaining to the presence of a minefield.
Minefield  level  scoring  is  used  to  evaluate  the  performance  of  the  airborne  minefield 
system  in  terms  of  Receiver  Operating  Characteristic  (ROC)  curves. 

Figure 2.1. Block Diagram for a Typical Airborne Minefield Detection System


Different  data  collection  scenarios,  as  well  as  mine  detection  and  minefield 
detection  algorithms,  are  the  main  drivers  of  the  performance  of  this  airborne  minefield 
detection  system.  Each  of  these  blocks  is  discussed  briefly  in  the  following  section  and 
elaborated  further  in  later  sections. 

2.1. DATA COLLECTION SCENARIOS

The  airborne  data  are  collected  in  the  form  of  a  sequence  of  image  frames 
captured  as  the  sensor  is  flown  over  the  minefield.  In  a  typical  system,  the  airborne 
sensor  is  flown  over  the  minefield  area  at  a  predefined  altitude  and  speed,  with  a  gimbal 
to collect frames of images in a specified pattern. The image data collected from one
flight are called a run. A specified number of frames creates a segment/field of regard
(FoR), and a set of segments constitutes a run. The geo-location of each frame, along
with other information, constitutes the metadata (data about data) collected using the onboard GPS
and IMU. The metadata and any available image overlap are used to reconstruct the
ground image for the FoR. The minefield decision is based on the detection statistics
calculated over this FoR. Various factors of data collection that may affect the minefield
detection performance are discussed below.

2.1.1.  Background  Data.  Background  data  play  an  important  role  in  performance 
evaluation.  Background  refers  to  the  ground  terrain  in  which  mines  are  being  laid.  The 
background  may  be  vegetation,  soil,  road,  or  a  combination  of  two  or  more  of  these  areas. 
Vegetation  can  be  either  thick  forest  or  tall  grass,  and  soil  can  be  either  rough  clay  or 
smooth desert terrain, among others. Apart from the natural background, there may be
cultured sources of clutter in the background, such as soft drink cans. In order
to capture the impact of these on the detection performance, both natural and cultured
sources of clutter should be modeled appropriately.

2.1.2. Minefield Layout. Mines are laid in a predefined layout, which may be
either a patterned or a scattered distribution. Each minefield will exhibit different
performance due to different types of mines (metal, plastic, etc.), different sizes of mines,
and different types of spatial distributions of mines. The layouts of the minefields are
dictated primarily by the mechanism by which mines are laid in the minefield and by tactical
scenarios. The mines can be laid manually, by ground vehicle, or by helicopter. The
mines can be surface laid or buried and can be of different sizes varying from small to
large. Mines can also be classified based on their composition, i.e., they can be made of
either metal or plastic. Figure 2.2 shows the distribution of mines: Figure 2.2(a) shows a
typical distribution of mines for a scattered minefield, and Figure 2.2(b) shows a typical
distribution of mines for a patterned minefield. Detection of mines depends greatly on the
type of mine (plastic or metal), size (large, medium, or small), and color of the mines (tan,
green). Apart from these factors, a number of other factors, such as background terrain and time
of day, also influence the performance.

Figure 2.2. Typical Distribution of Mines in (a) Scattered and (b) Patterned Minefields
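
The two layout types can be illustrated with a minimal sketch. It assumes that a scattered minefield is approximated by independent uniform placement over a rectangle and that a patterned minefield consists of parallel rows with regular spacing and a small Gaussian placement error. The function names, dimensions, and spacings are hypothetical and are not taken from the simulation tool.

```python
import random


def scattered_minefield(n_mines, length_m, width_m, rng):
    """Mines placed independently and uniformly over a length x width rectangle."""
    return [(rng.uniform(0.0, length_m), rng.uniform(0.0, width_m))
            for _ in range(n_mines)]


def patterned_minefield(n_rows, mines_per_row, row_gap_m, mine_gap_m,
                        jitter_m, rng):
    """Mines laid in parallel rows with regular spacing and Gaussian placement error."""
    mines = []
    for r in range(n_rows):
        for k in range(mines_per_row):
            x = k * mine_gap_m + rng.gauss(0.0, jitter_m)
            y = r * row_gap_m + rng.gauss(0.0, jitter_m)
            mines.append((x, y))
    return mines


rng = random.Random(0)  # fixed seed so the example is repeatable
scattered = scattered_minefield(30, 200.0, 60.0, rng)        # hypothetical 200 m x 60 m area
patterned = patterned_minefield(3, 10, 15.0, 8.0, 0.5, rng)  # hypothetical 3 rows of 10 mines
```

Both functions return lists of (x, y) ground coordinates in meters, so the same downstream detection code can be exercised on either layout.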


2.1.3.  Platform  Data.  Platform  data  comprises  a  number  of  parameters  such  as 
the  altitude  at  which  the  UAV  is  flying,  flight  speed,  flight  angle,  and  start  position  of  the 
UAV.  The  flight  altitude  decides  the  ground  sample  distance  (GSD)  along  with  other 
parameters  for  the  images,  whereas  the  flight  speed  determines  the  frame  rate  for  the 
image  data  to  be  collected  and  the  in-flight  overlap  between  the  frames.  In  the  real 
environment,  these  platform  parameters  are  not  constant  and  change  across  the  frames 
and  segments  due  to  various  reasons.  Various  distributions  can  be  used  to  model  these 
parameters.  The  variability  of  the  start  position  and  the  start  angle  relate  to  how  the  run 
encounters  the  minefield  front. 
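
As a rough illustration of how altitude and speed feed into the image geometry, the sketch below assumes a nadir-looking camera over flat terrain, so that the ground footprint along one image axis is 2 A tan(FOV/2). The function names and the example field of view, pixel count, and frame rate are hypothetical. With a step-stare gimbal the effective along-track spacing between frames also depends on the step pattern, which is ignored here.

```python
import math

FEET_TO_M = 0.3048
KNOTS_TO_MPS = 0.514444  # 1 knot = 1852 m / 3600 s


def footprint_m(altitude_ft, fov_deg):
    """Ground footprint (m) along one image axis for a nadir-looking camera."""
    altitude_m = altitude_ft * FEET_TO_M
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def gsd_m(altitude_ft, fov_deg, n_pixels):
    """Ground sample distance (m/pixel) along one image axis."""
    return footprint_m(altitude_ft, fov_deg) / n_pixels


def inflight_overlap(altitude_ft, fov_deg, speed_knots, frame_rate_hz):
    """Fraction of a frame shared with the next frame along the flight direction."""
    advance_m = speed_knots * KNOTS_TO_MPS / frame_rate_hz  # ground distance per frame
    overlap = 1.0 - advance_m / footprint_m(altitude_ft, fov_deg)
    return max(0.0, overlap)  # no overlap once the advance exceeds the footprint


# Example call with an assumed 5-degree FOV, 640 pixels, and a 2 Hz frame rate.
example_gsd = gsd_m(2073.0, 5.0, 640)
example_overlap = inflight_overlap(2073.0, 5.0, 72.77, 2.0)
```

Under these assumptions the GSD grows linearly with altitude, and the in-flight overlap shrinks as the flight speed increases or the frame rate drops.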

The  flight  speed  and  a  number  of  other  factors  including  the  wind  angle,  roll, 
yaw,  and  pitch  of  the  vehicle  are  also  not  constant  throughout  the  whole  run.  Even  the 
altitude  of  the  flight  changes  over  the  run  due  to  changes  in  the  terrain  relief,  which 
eventually  change  the  image  resolution  for  a  particular  frame.  Sensor  parameters  such  as 
side-step  overlap  also  vary  due  to  wobbling  in  the  platform  and  gimbal  pointing  error. 
Figure 2.3 shows variations in the flight altitude, flight speed, and heading angle for four
consecutive  segments  (84  frames)  corresponding  to  the  same  run  of  the  airborne  data. 
Table  2.1  shows  various  parameters  and  corresponding  statistics  for  a  particular  run. 


Figure 2.3. Variation in Flight Altitude, Speed, and Heading Angle for Four Segments from Airborne Data


2.1.4. Sensor Data. Sensor data deal with the parameters of the sensor used for
the collection of data. For current purposes, the sensors are imaging sensors that
operate over various frequency bands of the EM spectrum. Many sensors operate in the
mid-wave infrared (MWIR) frequency range because this frequency band is sensitive to
thermal contrast as well as reflectance and has desirable spatial resolution. Significant
thermal contrast can be found between metallic and non-metallic mines and typical
backgrounds for surface as well as buried mines, making the imaging more convenient
than other modalities. The MWIR provides useful data during both day time and night
time imaging. Multi-spectral imaging (MSI) sensors in the visual and near-infrared bands are also
used for image capture. These sensors serve the dual purpose of creating a colored
image from the individual bands and showing features in the visual and near-infrared
bands that may not be prominent in any single band.


Table 2.1. Various Parameters and Segment Information for a Particular Run

Date                  8-May-05
Time                  14.51.12    Afternoon Time Data
No. of IR segments    57
No. of MSI segments   57

                  Mean       Standard Deviation   Units
Altitude          2073.00    149.59               feet
Flight Speed      72.77      3.29                 knots
Segment Width     59.95      9.40                 meters
Segment Length    172.48     10.74                meters

Figure 2.4 shows an example of the tiled image for the daytime MWIR image
segment. Figure 2.5 shows a single frame for the red, green, blue, and NIR bands of the MSI
sensor. The MSI sensor, however, is unavailable at night time due to the lack of
illumination. Figure 2.6 shows the tiled image for the night time MWIR image segment.
For the sake of comparison, the area covered in the night time and day time IR segments is the
same. Because MWIR data show a mix of reflected and thermal signatures, in contrast to
MSI data, which are primarily reflection based, MWIR is useful even at night.

2.1.5. Reconstruction/Registration. The individual overlapping image frames
for the FoR obtained using the gimbal need to be stitched together to create a mosaic
image of the FoR. This process is called registration. Figures 2.7 and 2.8 show an
example for the daytime MSI and MWIR image segments, respectively, for the same
segment whose tiled image is shown in Figure 2.4. The colored images in Figure 2.7 and
elsewhere are generated using the red, green, and blue bands. Figure 2.9 shows the
corresponding night time data for MWIR. One of the same terrain features is indicated by
a red circle in all three images for easy comparison.


Figure 2.5. (a) Red, (b) Green, (c) Blue, and (d) NIR Frames for an MSI Image Segment

Figure 2.6. Tiled Image Segment for Night Time MWIR Data


Figure  2.7.  Afternoon  Time  MSI  Registered  Image  Segment 


The registration is possible due to the availability of metadata as well as in-step
and in-flight overlap between image frames. Due to the computational complexity of
image-based registration, it is desirable for part of the processing to be done with registration based
only on the geolocation in the metadata. However, the registration/reconstruction error will
be worse in those cases when the overlap between image frames is insufficient or cannot
be used due to computational costs. The resulting larger registration error has a bearing on
minefield performance, especially for patterned minefields, since the detected mines may
lose their linearity.

Figure 2.8. Afternoon Time MWIR Registered Image Segment


Figure  2.9.  Night  Time  Segment  for  MWIR  Data 


Mine level detection of the mine-like targets can be performed either at the frame
level or at the segment level. However, frame level calculation may be biased
due to duplication of the false alarms in adjacent (both in-step and in-swath)
frames, because the contribution from one false alarm may be
recorded multiple times. For this reason, good registration is important in order to
accurately transform these individual frames into a single coordinate system. Thus
reconstruction is also the method by which the anomaly detections of individual frames
are resolved to identify unique detections in the FoR.

2.2.  MINE  DETECTION 

The purpose of the mine detection block is to detect the locations of potential mines
once the anomaly values are assigned to each pixel. The mine detection step is an intermediate
step before minefield detection that provides information about the detection of
mine-like targets along with certain inevitable false alarms. Mine detection depends on a
number of factors including the type of sensor(s), target signature (signal to clutter ratio),
size of the target, resolution of the sensor, type of the background, and the algorithm
(anomaly detection, false alarm mitigation methods) used.

In the proposed simulation system, only a set of potential mine locations and
potential false alarm locations is simulated. Thus it is important to assign appropriate
anomaly detection statistics to the mine and false alarm points. Different models for the
distribution of anomaly detection statistics can be used, depending on the detection
algorithm. One of the popular algorithms used for anomaly detection is RX [Reed and
Yu, 1990]. The corresponding detection statistics under a white Gaussian background
follow the central F-distribution. However, the distribution of RX detection statistics can
be quite different for a more general background due to cultured clutter and spatial
correlation in the background data, which influences the system performance. One of the
major reasons for deviation from the ideal behavior is that the assumption of a white Gaussian
background is not always true, due to the presence of different classes of terrain
[Stein et al., 2002] and the presence of multi-resolution feature spaces [Noiboar and
Cohen, 2007]. A number of other distributions for anomaly detectors have been proposed based
on different features, which are discussed in Section 4.5.
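
To make the form of the RX statistic concrete, the following is a simplified sketch of a global RX detector, in which each pixel is scored by its Mahalanobis distance from the mean and covariance estimated over all pixels. This is an assumption made for brevity: practical RX implementations typically estimate the background from a local window around each pixel. The function names are hypothetical, and the small linear solver is included only to keep the example free of external libraries.

```python
def mean_and_covariance(pixels):
    """Sample mean vector and covariance matrix (n - 1 divisor) of band vectors."""
    n, b = len(pixels), len(pixels[0])
    mean = [sum(p[j] for p in pixels) / n for j in range(b)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / (n - 1)
            for j in range(b)] for i in range(b)]
    return mean, cov


def solve(a, rhs):
    """Solve a z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]  # fails if the covariance is singular
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def rx_scores(pixels):
    """Global RX statistic (x - mean)^T C^-1 (x - mean) for every pixel."""
    mean, cov = mean_and_covariance(pixels)
    scores = []
    for p in pixels:
        d = [p[j] - mean[j] for j in range(len(mean))]
        z = solve(cov, d)  # z = C^-1 d without forming the inverse
        scores.append(sum(di * zi for di, zi in zip(d, z)))
    return scores
```

Pixels are passed as a list of per-pixel band vectors, for example `rx_scores([(1.0, 2.0), (2.0, 1.0), (1.5, 1.8), (9.0, 0.5)])`; pixels that differ most from the bulk of the data receive the largest scores.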

The performance of the sensor at the mine detection level is parameterized in
terms of the probability of detection (PD) and the corresponding false alarm rate (FAR)
(false alarms per m²). A mine level ROC (Receiver Operating Characteristic) curve is
drawn showing the mine level performance in terms of PD and FAR. The issue in
mine detection performance modeling is to develop a model for the selected mine
detection algorithm that assigns proper test statistics to mine and false alarm detections and
estimates mine level performance (in terms of PD and FAR) for a given sensor, target, and
background under reasonable assumptions.
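
The mine level ROC computation can be sketched as follows, assuming that every mine and every false alarm has already been assigned a detection statistic and that the surveyed area is known. The function name and the toy scores are hypothetical. Mines that produce no anomaly at all would have to be added to the denominator of PD, which this sketch does not do.

```python
def mine_level_roc(mine_scores, false_alarm_scores, area_m2, thresholds):
    """Return (threshold, PD, FAR) triples; FAR is in false alarms per square meter."""
    roc = []
    for t in thresholds:
        detected = sum(1 for s in mine_scores if s >= t)
        false_alarms = sum(1 for s in false_alarm_scores if s >= t)
        pd = detected / len(mine_scores)  # fraction of mines above the threshold
        far = false_alarms / area_m2      # false alarm density above the threshold
        roc.append((t, pd, far))
    return roc


# Toy example: four mines and five false alarms over a 1000 m^2 area.
roc = mine_level_roc([5.0, 8.0, 9.0, 12.0], [1.0, 2.0, 3.0, 6.0, 7.0],
                     1000.0, [0.0, 6.0, 10.0])
```

Sweeping the threshold from low to high traces the curve from the high-PD, high-FAR corner toward the origin.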

2.3.  THRESHOLDING  AND  TARGET  SELECTION 

The anomaly detection algorithm assigns a confidence value or some other metric
to each point, indicating how likely it is to be a mine. The next step is to threshold the list of these locations to
obtain the set of valid anomalies that are statistically distinct from the neighboring
background and are highly probable mines. These sets of anomalies are then passed on for
minefield level processing. Another reason for performing the thresholding operation is
to reduce the number of targets that are passed on for minefield detection.
The thresholding algorithm used for the selection of targets affects the minefield detection
performance. Three different thresholding schemes are discussed in Section 3.3. False
alarm mitigation (FAM) techniques can also be applied before passing the thresholded
targets on for minefield level processing. FAM techniques try to reduce false alarms by
exploiting the shape, photometric, polarity, spectral, and other properties of the mine
targets to reject likely false alarms. FAM techniques are discussed in detail in [Menon
2005]. However, for the present implementation, these techniques are not modeled.
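
Two simple ways of selecting targets from a list of anomalies are sketched below for illustration. They are assumptions chosen for clarity and are not necessarily the schemes described in Section 3.3. The first applies a fixed threshold to the detection statistic; the second keeps the strongest anomalies so that the number of targets per unit area of the FoR stays roughly constant. Anomalies are represented as hypothetical (x, y, score) tuples.

```python
def threshold_fixed(anomalies, threshold):
    """Keep anomalies whose detection statistic meets a fixed threshold."""
    return [a for a in anomalies if a[2] >= threshold]


def threshold_constant_density(anomalies, area_m2, detections_per_m2):
    """Keep the strongest anomalies so that the target density over the FoR is fixed."""
    n_keep = int(round(area_m2 * detections_per_m2))
    ranked = sorted(anomalies, key=lambda a: a[2], reverse=True)
    return ranked[:n_keep]


anomalies = [(0.0, 0.0, 1.0), (1.0, 1.0, 5.0), (2.0, 2.0, 3.0), (3.0, 3.0, 9.0)]
fixed = threshold_fixed(anomalies, 3.0)
top = threshold_constant_density(anomalies, 100.0, 0.02)  # keeps 2 targets
```

A fixed threshold lets the number of targets vary with the background, whereas the constant-density rule fixes the number of targets and lets the effective threshold vary.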

2.4.  MINEFIELD  DETECTION 

Locations  of  targets  obtained  after  target  selection  and  thresholding  are  then  used 
to  detect  the  presence  of  the  minefield  and  eventually  evaluate  the  minefield  confidence 
metric  for  the  given  FoR.  The  minefield  detection  performance  depends  on  the  type  of 
minefield,  characteristics  of  the  background,  mine  level  performance  and  the  minefield 
algorithm. In a typical implementation, separate algorithms are used for patterned and
scattered minefields. Numerous algorithms are available in the literature for these two
types of minefields. The empty boxes test (EBT) algorithm [Lake et al., 1997], linear
pattern detection [Malloy, 2003; Muise and Smith, 1995], the robust mine detection
algorithm [Robins and Robinson, 1995], and Hough line transforms [Carlson et al., 1994]
are some of the patterned minefield detection algorithms available in the literature.
Scatter Number [Earp, 2000b] and Scatter Log weighted [Earp et al., 1995] are some of
the techniques used for scattered minefield detection.
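
As an illustration of the idea behind line-based patterned minefield detection, the sketch below applies a basic Hough transform to a set of target locations. Each point votes for every discretized line x cos(theta) + y sin(theta) = rho passing through it, and the line with the most votes is returned. This is a generic textbook form, not the specific algorithm of any of the cited works; the function name, angular resolution, and bin width are hypothetical.

```python
import math
from collections import Counter


def hough_best_line(points, n_angles=180, rho_bin_m=1.0):
    """Return (theta_rad, rho_m, votes) of the most-voted line
    x*cos(theta) + y*sin(theta) = rho over the given (x, y) points."""
    votes = Counter()
    for x, y in points:
        for i in range(n_angles):
            theta = math.pi * i / n_angles
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(rho / rho_bin_m))] += 1  # quantize rho into bins
    (i, rho_idx), count = votes.most_common(1)[0]
    return math.pi * i / n_angles, rho_idx * rho_bin_m, count
```

The vote count of the best line can serve as a crude minefield confidence metric: a row of mines produces a strong peak, whereas spatially random false alarms spread their votes over many bins. The bin width should be chosen with the expected registration error in mind, since misregistered mines spill into neighboring bins.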

The performance of the system at the minefield detection level is defined in terms
of the minefield probability of detection and the corresponding false alarm rate (false
alarms per km²). The system specifications are often defined in terms of the operating
point on the resulting ROC curve. Another factor that may impact the evaluation of minefield
performance is the minefield scoring method used. Two possible scoring methods are
discussed in Section 3.4.

3. SIMULATION SYSTEM


This section presents an overview of the simulation system used to
simulate and evaluate the performance of a typical airborne minefield detection system.
A graphical user interface (GUI) called "SimulAMFD" was developed to facilitate this evaluation
[Agarwal and Agarwal, 2006]. The GUI provides an input interface between the user and
the modeling software and allows the user to specify different design parameters,
evaluate mine and minefield level performance, and analyze individual mine and
minefield detection algorithms. The simulation system estimates airborne mine and
minefield performance under different sensor and minefield layout scenarios for a
particular choice of data collection parameters and algorithms. The
methodologies and models used for data collection, mine detection, and minefield
detection are discussed separately below. It is also possible to compare analytical and
simulation-based results for selected detection scenarios for validation purposes. The
simulation system also has the flexibility to support design trade-offs by comparing
performance between different choices of sensor-related parameters.

3.1.  DATA  SIMULATION 

Data  collection  is  the  first  step  for  any  minefield  detection  system.  The  data  for 
the  simulation  are  generated  in  a  manner  that  closely  resembles  the  data  collection  in 
various  airborne  minefield  detection  programs.  A  large  number  of  parameters  are 
involved  in  the  simulation  of  data  for  an  airborne  minefield  detection  system.  Most  of 
these  parameters  are  used  to  model  the  flight  path  over  the  simulated  background  and 
minefield  scenario.  Table  3.1  gives  a  list  of  the  parameters  needed  for  modeling  and 
analysis  of  data  collection  scenarios.  The  parameters  shown  in  Table  3.1  are  directly 
provided  to  the  simulation  system.  A  number  of  other  parameters  are  also  used,  which  are 
derived  using  these  parameters.  The  input  and  derived  parameters  are  discussed  in  the 
following  subsections  in  more  detail. 

Table 3.1. Data Collection Parameters for Simulation

DATA COLLECTION PARAMETERS

Platform Data                  Sensor and Gimbal Data         Background Data              Minefield Data
Nominal Flight Speed (V)       Camera Resolution (N x M)      Length                       Minefield Layout
Variation in Flight Speed      Camera Orientation             Breadth                      Minefield Distribution
Nominal Flight Altitude (A)    FOV (FOVx, FOVy)               Background Anomaly Density   Minefield Position
Variation in Flight Altitude   Frame rate (H)                 Spatial Distribution         Mine Size and Material
Flight Angle                   Number of Steps per FoR (5)    Anomaly Statistics           Mine Statistics
Flight Position                Side step Overlap (Av)
                               Number of Swath per FoR (S)

3.1.1. Platform Data. The flight speed (knots) and its corresponding variation are
provided directly in the simulation tool. Similarly, the flight altitude (feet) and its
corresponding variation are also predefined. The flight angle in degrees is the heading
angle of the flight with respect to the X axis on the ground. The flight angle can be a
fixed number or may have some variation. The variations in flight speed, flight altitude,
and flight angle can be modeled using a Uniform or Gaussian distribution. For this,
appropriate inputs (minimum and maximum values for the Uniform distribution; mean and
standard deviation for the Gaussian distribution) are provided to the simulation
system. The flight position is the position where the flight path for a given run will start.
Similar to the flight angle, the position can be a fixed X and Y location, or it can be
distributed according to a Uniform distribution, for which the start and end positions in
both the X and Y directions are provided.
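
The fixed, Uniform, and Gaussian options described above could be sampled as in the following minimal Python sketch. The tuple-based specification format, the function names, and the example values (other than the 2050 ft altitude and ±10° angle range quoted later in this section) are assumptions made for illustration and are not taken from the simulation tool itself.

```python
import random

def sample_parameter(spec, rng=random):
    """Draw one value for a platform parameter.

    spec is a hypothetical tuple: ("fixed", value),
    ("uniform", minimum, maximum) or ("gaussian", mean, std_dev).
    """
    kind = spec[0]
    if kind == "fixed":
        return spec[1]
    if kind == "uniform":
        return rng.uniform(spec[1], spec[2])
    if kind == "gaussian":
        return rng.gauss(spec[1], spec[2])
    raise ValueError(f"unknown distribution: {kind}")

def sample_platform_state(config, rng=random):
    """Sample every platform parameter once for a single simulated run."""
    return {name: sample_parameter(spec, rng) for name, spec in config.items()}

# Illustrative configuration (names and most values are assumptions).
example_config = {
    "speed_knots": ("gaussian", 75.0, 1.0),
    "altitude_ft": ("fixed", 2050.0),
    "angle_deg": ("uniform", -10.0, 10.0),
    "start_x_m": ("uniform", 0.0, 300.0),
}
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.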

3.1.2. Sensor and Gimbal Data. The field of view, number of rows and columns,
and altitude define the GSD, which in turn defines the target size. Camera orientation
defines the orientation for the data collection. It can be either 0 degrees, which is the
default orientation for step-stare data collection (in which the columns of an image frame
are in the in-flight direction and the rows of an image frame are in the across-flight
direction), or 90 degrees, which is the default orientation for push broom data collection.
The number of rows and columns define the size of an image frame. If the image size is
given by N x M in pixels and $\mathrm{FOV}_x$ and $\mathrm{FOV}_y$ are the fields of view in the X and Y
directions, then the resolution or ground sample distance (GSD) of the sensor (in inches), r, is
given as

$$ r = \frac{\pi A\,\mathrm{FOV}_x}{15\,M} = \frac{\pi A\,\mathrm{FOV}_y}{15\,N} \quad \text{(inch)} \tag{3.1} $$

where altitude A is in feet and $\mathrm{FOV}_x$ and $\mathrm{FOV}_y$ are in degrees. The factor 15 combines
the feet-to-inches conversion (12) with the degrees-to-radians conversion ($\pi/180$).

Now, the length (X) and width (Y) of the image (in meters) for the step-stare mode
are given by

$$ X = 0.0254\,rM, \qquad Y = 0.0254\,rN \tag{3.2} $$

and for the push broom mode are given by

$$ X = 0.0254\,rN, \qquad Y = 0.0254\,rM \tag{3.3} $$
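
As a worked illustration of Equations (3.1) to (3.3), the sketch below computes the GSD and the frame footprint. The function names are hypothetical, and the footprint helper assumes square pixels, so that the X-direction form of Equation (3.1) also gives the Y-direction GSD.

```python
import math

INCH_TO_M = 0.0254  # inches to meters, as used in Equations (3.2) and (3.3)

def gsd_inches(altitude_ft, fov_deg, n_pixels):
    """Ground sample distance in inches, Equation (3.1)."""
    return math.pi * altitude_ft * fov_deg / (15.0 * n_pixels)

def frame_footprint_m(altitude_ft, fov_x_deg, n_rows, m_cols, push_broom=False):
    """Return (X, Y) frame dimensions in meters.

    Assumes square pixels, so a single GSD value r serves both directions.
    """
    r = gsd_inches(altitude_ft, fov_x_deg, m_cols)
    along_m = INCH_TO_M * r * m_cols   # 0.0254 r M
    across_m = INCH_TO_M * r * n_rows  # 0.0254 r N
    # Step-stare: X = 0.0254 r M, Y = 0.0254 r N; push broom swaps the two.
    return (across_m, along_m) if push_broom else (along_m, across_m)
```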

The frame rate is defined as the number of image frames per second and is
provided in Hz. The number of steps depends on the mode of the data collection. For the
push broom mode, the number of steps is equal to one, whereas for the step-stare mode,
the number of steps is greater than one and is derived based on the swath width
requirements. Side step overlap $\lambda_y$ can be controlled by the gimbal, and it is taken as an
independent variable. A positive value of $\lambda_y$ represents the corresponding fraction of
overlap between images in the direction perpendicular to the flight, and a negative value
represents a holiday (or a gap) between frames. The number of steps and side step
overlap, along with other parameters, define the swath width, which determines the width
of the minefield encountered in the run and has a bearing on the minefield detection. The
number of swaths per FoR is a predefined number that dictates the length of the FoR.
In-flight overlap depends on the frame rate of the sensor, the flight speed, and the
frame length. The in-flight overlap $\lambda_x$ is given by

$$ \lambda_x = 1 - \frac{V s}{1.94384\, X H} \tag{3.4} $$

where V is the flight speed in knots, s is the number of steps per swath, and H is the
frame rate of the sensor in Hz. The constant 1.94384 converts knots to meters per second.
The swath width W and length L of an FoR are then calculated as

$$ W = (1 - \lambda_y)\,sY + \lambda_y Y, \qquad L = (1 - \lambda_x)\,SX + \lambda_x X \tag{3.5} $$

where S is the number of swaths per FoR.
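
Equations (3.4) and (3.5) translate directly into code. The following sketch uses hypothetical function and argument names and assumes the frame length X and width Y are already expressed in meters.

```python
KNOTS_PER_MPS = 1.94384  # knots per (meter per second), as in Equation (3.4)

def in_flight_overlap(speed_knots, steps_per_swath, frame_length_m, frame_rate_hz):
    """In-flight overlap lambda_x, Equation (3.4)."""
    # Distance flown while one swath of s frames is collected at H frames/second.
    advance_m = speed_knots * steps_per_swath / (KNOTS_PER_MPS * frame_rate_hz)
    return 1.0 - advance_m / frame_length_m

def swath_width(side_overlap, steps_per_swath, frame_width_m):
    """Swath width W, Equation (3.5)."""
    return ((1.0 - side_overlap) * steps_per_swath * frame_width_m
            + side_overlap * frame_width_m)

def for_length(overlap_x, swaths_per_for, frame_length_m):
    """FoR length L, Equation (3.5)."""
    return ((1.0 - overlap_x) * swaths_per_for * frame_length_m
            + overlap_x * frame_length_m)
```

With zero overlap the expressions reduce to s frames side by side (W = sY) and S frames end to end (L = SX), which is a quick sanity check on the formulas.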

3.1.3. Background Data. Background data such as length and breadth (meters) and
anomaly density (per square meter) are directly provided to the simulation system.
Length and breadth define the size of the simulated background area. A set of parameters
must be provided to define the distribution of the anomaly statistics. Different anomaly
detection (AD) algorithms can be used to generate AD values for the simulation, and the
anomaly statistics are dictated by the AD algorithm. AD algorithms calculate the test
statistic at all the pixel locations in the image to find those pixels that are statistically
different from the background. This test statistic depends on a number of factors such as
background type, AD algorithm used, and time of day. The test statistic can then be
modeled using various statistical distributions. RX is one of the most popular AD
algorithms, whose test statistic can be modeled by a Beta distribution or a Gamma
distribution as explained in [Ganju, 2006]. Another distribution that can be used to model
the RX anomaly values is the central F distribution, which is used in this thesis and discussed
in Section 4.3. Distributions other than these can also be used to model the test statistic.
Thus, for modeling the anomaly values, the parameters corresponding to a particular
distribution are provided as inputs to the simulation system. The parameters can be the
degrees of freedom (numerator, denominator) for a central F distribution, the shape
parameters for a two-parameter Beta distribution, or a shape parameter and a scale
parameter for a two-parameter Gamma distribution. Spatial distribution deals with the
statistical models that can be used to model the spatial locations of these anomaly values.
Similar to the anomaly statistics, the spatial distribution requires inputs for the mathematical
(or statistical) model used to model the spatial locations of the background anomaly
values. A Poisson distribution is currently implemented to model the spatial distribution of
background anomalies.
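
A minimal sketch of such a background generator is given below, assuming a homogeneous Poisson point process for the locations and central F anomaly values. The function names are hypothetical; the Poisson count is drawn by counting unit-rate exponential arrivals, and each F variate is formed as a ratio of scaled chi-squared variates, neither of which is claimed to be the method used in the simulation tool.

```python
import random

def poisson_count(mean, rng):
    """Poisson(mean) draw: number of unit-rate exponential arrivals in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def f_variate(dof_n, dof_d, rng):
    """Central F variate as a ratio of chi-squared variates over their DOF."""
    num = rng.gammavariate(dof_n / 2.0, 2.0) / dof_n  # chi-squared(dof_n) / dof_n
    den = rng.gammavariate(dof_d / 2.0, 2.0) / dof_d  # chi-squared(dof_d) / dof_d
    return num / den

def simulate_background(length_m, breadth_m, density_per_m2, dof_n, dof_d, seed=None):
    """Return a list of (x, y, anomaly_value) background points."""
    rng = random.Random(seed)
    n_points = poisson_count(density_per_m2 * length_m * breadth_m, rng)
    return [(rng.uniform(0.0, length_m), rng.uniform(0.0, breadth_m),
             f_variate(dof_n, dof_d, rng)) for _ in range(n_points)]
```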

3.1.4. Minefield Data. In the present tool, nine of the many possible minefield
templates used in minefield deployment [FM 20-32, 1998] have been implemented.
Among these, scattered and patterned minefields are the two most commonly used
templates. Figures 3.1 and 3.2 show representative spatial distributions of the two
minefield templates, respectively. For a patterned minefield, mines are arranged in three
rows that are not necessarily straight or linear.


Figure  3.1.  Minefield  Template  Used  for  Generating  a  Scattered  Minefield 


[Figure labels: 10-40 meters; 300 meters; ~120 meters]


Figure 3.2. Minefield Template Used for Generating a Patterned Minefield

The minefield position provides information about the locations where a
minefield is placed in the simulated background. Mine statistics describe the
statistical models that are used to model the anomaly detection statistics at the locations
of mines. Similar to the false alarm statistics, a set of parameters must be provided
for the distribution of mine statistics. One such model is discussed in Section 4.1.

Figures 3.3 and 3.4 represent simulated runs. Figure 3.3 shows a simulated run in
which the segment passes over the minefield completely and thus constitutes a minefield
segment, whereas in Figure 3.4 the run completely misses the minefield. The flight angle
is assumed to be uniformly distributed with the minimum and maximum flight angle
($\pm\Delta\theta$) specified as ±10°. The altitude and flight speed are assumed to be constant and kept
at 2050 ft and 75 knots, respectively. The flight position is also assumed to be uniformly
distributed along the 'x' direction (minefield depth) and kept constant along the 'y'
direction (minefield front). The background point locations are simulated as a Poisson
distribution with a specified density. A part of these background points constitutes the
mine level false alarms for the background. In the current simulation, a density of 0.01
background points per square meter is used because in most cases the highest false alarm
rate considered at the mine level is 0.01 FA/m². The minefield layout of a patterned or
scattered minefield is designed according to specified templates. In all simulations, the
flight path is taken to be approximately perpendicular to the minefield front. However, a
variation of approximately 15° is allowed. Green dots represent the background
clutter, and red diamonds represent the mines. Represented segments and frames are also
highlighted along with the flight start position and flight angle.

3.2.  EVALUATING  MINE  LEVEL  PERFORMANCE 

Once the data are generated corresponding to the provided parameters, the next
step is to evaluate the mine level performance. Mine level performance depends on the
AD algorithm (such as RX), as captured in Figure 2.1. The anomaly values
obtained from the AD algorithm are thresholded, and for each threshold, the probability of
detection (PD) of a mine and the corresponding false alarm rate (FAR) are calculated. The
PD and FAR depend on the distribution of the anomaly values for the false alarms and
mine targets. Mine level ROC curves are then drawn by plotting the PD against the FAR.
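
The thresholding sweep described above can be sketched as follows. The function name and the `area_m2` argument (the ground area over which the clutter values were collected) are assumptions for illustration; the sketch simply counts mine and clutter values at or above each candidate threshold.

```python
def mine_level_roc(mine_values, clutter_values, area_m2):
    """Return a list of (FAR per square meter, PD, threshold) points.

    Thresholds are swept from the highest observed anomaly value downward.
    """
    thresholds = sorted(set(mine_values) | set(clutter_values), reverse=True)
    roc = []
    for t in thresholds:
        pd = sum(v >= t for v in mine_values) / len(mine_values)
        far = sum(v >= t for v in clutter_values) / area_m2
        roc.append((far, pd, t))
    return roc
```

Plotting the second element of each tuple against the first gives the mine level ROC curve.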


Figure 3.3. Simulated Run Passing Over the Minefield Completely

Figure 3.4. Simulated Run Missing the Minefield Completely

3.3. THRESHOLDING SCHEMES

The anomaly detection values are thresholded to obtain the set of most likely
anomalies, which are statistically distinct from the neighboring background. The threshold
plays a significant role because the targets selected at this level participate in the
minefield level detection. The threshold should pass only a reasonable number of
false alarms while still selecting enough mines for effective minefield
detection. A number of thresholding schemes can be used for this purpose. Each
thresholding scheme affects the probability of minefield detection and the corresponding
false alarm rate in a different manner. Three different thresholding methods are discussed
in this section.

3.3.1. Fixed Threshold. This is a straightforward scoring approach in which only
the targets with detection statistics above the specified threshold are allowed to take part
in scoring. The threshold is provided by the user in the modeling tool. This scoring
approach provides reasonable performance for the cases where the non-maximal
suppressed anomaly values follow almost the same statistical distribution for all FoRs.
Thus, this scoring scheme can yield very good minefield detection performance if the
background is homogeneous. It is, however, often impossible to select an appropriate
threshold value a priori because the desired value may differ from
terrain to terrain due to differences in the background data. The threshold will also
depend on the time of day due to shadows and other effects. Moreover, due to the
difference in background features from one FoR to another, the number of detections may
be very large in some FoRs and very small in others.

3.3.2. Constant Target Rate. The constant target rate (CTR) implies that in each
FoR, a fixed number of target locations with the highest detection statistics are selected.
The number of targets is provided as an input in the modeling tool. Effectively, the detection
threshold changes from one FoR to another in this case, depending on the selected
target rate per FoR. The minefield level performance becomes subjective for this
thresholding scenario because the number of false alarms selected per FoR remains the same
irrespective of the type of terrain. The minefield detection in this case should rely on a
metric different than the number of detections. This thresholding scheme may result in
poor mine detection performance in a non-homogeneous background
because the probability of selecting some fixed number of mine targets in a highly
cluttered area will be less than the probability of selecting the same number of mine
targets in a low-clutter area. Moreover, if the number of targets is not selected
appropriately, it may result in poor minefield performance as well. Thus, this
scheme can be used effectively for a homogeneous background, but not for a
non-homogeneous background, for the reasons stated above.

3.3.3. Constant False Alarm Rate. In the constant false alarm rate (CFAR) case, the
detection statistics are modeled by an appropriate distribution and a threshold is selected
for a desired false alarm rate. The false alarm rate per square meter ($p_B$) is specified. In
this thresholding scheme, the expected number of false alarms in a selected area is a
constant, which depends on the type of distribution that is used to model the detection
statistics and the goodness of fit of the model. For each segment, the detection statistics
are modeled by an appropriate distribution and the threshold is selected adaptively as
shown in [Ramachandran, 2004].

A  particular  false  alarm  rate  can  be  used  depending  on  the  type  of  background  and 
terrain.  If  the  background  is  highly  cluttered  then  it  can  be  anticipated  that  for  a  higher 
detection  of  mine  targets,  the  false  alarm  rate  should  be  higher.  In  contrast,  for  areas  with 
low  natural  clutter,  a  lower  false  alarm  rate  can  be  used  to  achieve  higher  detection  of 
mine  targets.  Thus,  CFAR  is  a  very  effective  scheme  for  thresholding  mine  targets, 
resulting  in  possibly  better  mine  and  minefield  detection  performance. 
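
The three thresholding schemes can be contrasted with a short sketch. All names are hypothetical. The CFAR helper assumes that a fitted clutter CDF and a clutter point density are available, and solves density x (1 - CDF(t)) = desired FAR by bisection; this is one plausible reading of the scheme, not the adaptive procedure of [Ramachandran, 2004].

```python
def fixed_threshold(values, threshold):
    """Keep detections whose statistic exceeds a user-supplied threshold."""
    return [v for v in values if v > threshold]

def constant_target_rate(values, n_targets):
    """Keep the n_targets highest detection statistics in the FoR."""
    return sorted(values, reverse=True)[:n_targets]

def cfar_threshold(cdf, clutter_density, far_per_m2, lo=0.0, hi=1e3, iters=100):
    """Threshold t such that clutter_density * (1 - cdf(t)) equals far_per_m2.

    cdf is the (assumed already fitted) CDF of the clutter statistic;
    the root is bracketed in [lo, hi] and found by bisection.
    """
    target_tail = far_per_m2 / clutter_density
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 - cdf(mid) > target_tail:
            lo = mid  # too many false alarms pass: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, with a unit exponential clutter model, a clutter density of 0.1 points per square meter, and a desired rate of 0.01 FA/m², the required tail probability is 0.1 and the threshold is ln(10).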

3.4.  EVALUATING  MINEFIELD  LEVEL  PERFORMANCE 

This is the final step in a typical airborne minefield detection system. The
minefield level performance depends on the minefield detection algorithm, mine level
performance, thresholding scheme, and the minefield scoring method used. Because most
of the mines are either scattered or laid in a pattern in a minefield, most of the minefield
detection algorithms provide an indication of presence in the form of a minefield
confidence metric, which is a quantitative measure of the confidence level for the
presence or absence of a minefield in that area. The thresholded anomaly values, along
with their spatial locations, are provided as an input to these minefield detection
algorithms, from which the minefield confidence metric is derived. The algorithms
currently implemented are pattern linear and pattern regular for a patterned
minefield, and scatter number and scatter log weighted for a scattered minefield. These
algorithms are explained in detail in Section 7.

Once the minefield level confidence values are obtained, the next and final step is
to score the minefield for its performance evaluation in terms of an ROC curve. This
ROC curve then represents the probability of detection against the probability of
false alarm for a minefield. The minefield confidence statistics effectively represent the
likelihood that a minefield is present in this FoR or the likelihood that the given FoR is
collected over an actual minefield. These FoRs may be non-overlapping, or a sliding
window approach may be used to define overlapping FoRs. Scoring can be done either for a
segment/FoR or for a complete run. Both of these scoring methods are currently
implemented in the simulation system.

In scoring the detection performance, the simplest approach is to identify FoRs
that are actually over the minefield to establish the ground truth. Once this ground truth is
established, each FoR can be scored as a detection or false alarm, and an ROC curve can
be drawn. For FoR-wise minefield scoring, one input is required in the simulation system.
If the input is between 0 and 1 (inclusive), then the ratio of the area occupied by the mines
to the area of the FoR should be greater than the input for the FoR to be called a minefield FoR.
However, if the number is greater than 1, then that many mines should be detected in an
FoR for it to be called a valid minefield FoR. The main problem with this approach is that
at times an FoR may only be partially over the minefield. In such cases one would have
to make an arbitrary choice of when to call a given FoR as belonging to a minefield and
when not to do so. The evaluated performance is significantly dependent on this choice.
Moreover, the resulting ROC curves give the probability of correct classification of the
FoR and not the probability of detection of the minefield.

An alternative, slightly more complicated, but more representative approach is to score
the runs and not the individual FoRs. In this case the scoring is based on the actual
geo-locations of a minefield. The minefield is defined by its location and extent. Any run that
intercepts the minefield suitably is said to be a minefield run. In practice, one run can
have more than one minefield, which would be evaluated as independent minefields. A
minefield is said to be detected if any FoR that falls over the minefield is flagged as
containing a minefield. Multiple detections over the same minefield location are
neglected. A similar approach is followed for false alarms. Any FoR that does not
intercept a minefield is a likely candidate for a false alarm. However, multiple false
alarms that hit in the same area of the field are neglected. Thus, once an FoR is called a
false alarm, any other FoR within a specified distance from this FoR will not be counted
as a false alarm. It is important to note that the length of this distance over which further
false alarms are neglected does not affect the measured false alarm rate; it only limits the
maximum false alarm rate that could be reported. Any reasonable size such as 100 m or
200 m can be used. However, the size should be at least as long as the length of an FoR. If
the scoring is done over the complete run, then any FoR having more than zero mine
targets can be defined as a valid minefield segment.
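
The false alarm exclusion rule for run-level scoring can be illustrated with a small sketch. It assumes, purely for illustration, that each flagged non-minefield FoR is summarized by a single along-track position in meters; the function name and this one-dimensional simplification are not from the simulation tool.

```python
def count_run_false_alarms(flagged_positions_m, exclusion_m):
    """Count false alarms along a run, ignoring repeats in the same area.

    flagged_positions_m: along-track positions of FoRs that do not intercept
    a minefield but were flagged as containing one.
    exclusion_m: distance within which further false alarms are neglected.
    """
    counted = []
    for pos in sorted(flagged_positions_m):
        # Count this FoR only if it is far enough from every counted false alarm.
        if all(abs(pos - c) >= exclusion_m for c in counted):
            counted.append(pos)
    return len(counted)
```

For flagged positions 0, 50, 120, 130, and 400 m with a 100 m exclusion distance, the flags at 50 m and 130 m are neglected and three false alarms are counted.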
4. ANOMALY DETECTION: MODELING RX STATISTICS

An AD algorithm calculates the detection statistics at all the pixel locations in the
image to find the pixels that are statistically different from the background. This section
discusses the modeling of the statistics for an Anomaly Detection (AD) algorithm. RX is
one of the most popular algorithms for anomaly detection [Reed and Yu, 1990; Holmes,
1995], and the modeling of its detection statistics is discussed in this section.
Further processing such as False Alarm Mitigation (FAM) can then be used to segregate
the possible mine targets from the false alarms depending on various feature classification
algorithms such as circularity, radial symmetry, and gray scale moments [Menon et al.,
2004]. For the present discussion, the RX algorithm is used to detect the possible mine
targets from the background anomalies. False alarm mitigation has not currently been
implemented.

4.1.  MODELING  RX  ANOMALY  DETECTOR 

Various  anomaly  detection  algorithms  are  proposed  for  mine  level  detection  in  the 
literature  such  as  RX  [Reed  and  Yu,  1990;  Holmes,  1995],  and  Unmixing  Component 
Analysis  [Yanfeng  et  al.,  2006].  An  optimal  matched  response  based  on  a  locally 
estimated  first-order  Gauss-Markov  model  for  the  background  and  known  mine  template 
has  been  proposed  by  Liao  et  al.  for  anomaly  detection  [Liao  et  al.,  2001].  For  the  current 
discussion,  the  anomaly  detection  algorithm  considered  is  RX.  The  RX  algorithm  has 
become  the  de-facto  baseline  anomaly  detector.  This  algorithm  assumes  the  available 
images  to  be  zero  mean,  uncorrelated,  and  Gaussian  distributed.  This  assumption  is  fair 
enough  for  most  of  the  low-resolution  electro-optical  sensors  (although  many  images  are 
not  truly  Gaussian  distributed).  By  definition,  images  are  not  zero  mean;  however,  real 
images  can  often  be  assumed  to  have  a  slowly  varying  mean  value.  A  non-stationary 
local  mean  can  be  subtracted  from  the  image  to  generate  a  locally  zero  mean  image, 
which  is  then  passed  through  the  RX  anomaly  detector.  The  RX  detector  then  provides 
the  statistical  measure  for  the  presence  of  an  anomaly  at  each  pixel  location  in  the  image. 
However, not all pixel RX values are used because local neighborhood values are often
correlated in the case of high resolution images. Instead, the local maxima in each
neighborhood are considered as viable anomalies. This is achieved by non-max
suppression, which can be viewed as a filter process that only allows the maximum value
in a given neighborhood to pass while suppressing all the other RX values. A detailed
explanation of the RX anomaly detector and the working of non-max suppression can be
found in [Ramachandran, 2004]. This section presents a brief overview of the RX
algorithm and outlines its functional working.

The RX algorithm involves three sets of masks: the target mask, the blanking
mask, and the demeaning mask. Figure 4.1 shows the geometry of the three masks used
in the current implementation. Square or rectangular masks can also be used. The target
radius ($r_T$) defines the target mask ($W_T$), which defines the size of the target signature,
which is assumed to be circular in shape. The annular region between the target radius
and blanking radius ($r_B$) defines the blanking mask and specifies the region that is
omitted from the estimation process. This region is excluded because it may typically
contain shadows and other reflective effects of the target. Inclusion of this region can
distort the target and clutter estimates. The third mask is the clutter mask, which is
defined as the annular region between the blanking radius ($r_B$) and demeaning radius ($r_D$)
and defines the extent of the background that is used to compute the background
covariance. The demeaning radius also defines the demeaning mask, which is used to
compute the local mean for local demeaning of the raw image.


[Figure 4.1 labels: Target Mask ($W_T$); Blanking Mask ($W_B$); Background/Clutter Mask ($W_C$); target radius ($r_T$); blanking radius ($r_B$); demeaning radius ($r_D$)]

Figure 4.1. Geometry of Masks in RX Anomaly Detection

The target radius and demeaning radius are independent of each other and provide
a good estimate of the target statistics and background statistics, respectively. Effectively,
the RX algorithm accepts the raw image as an input and calculates the convolution with
the set of the above three masks. Thus, what is being computed is effectively the signal to
clutter ratio for each pixel in the image. The RX output is an image equal in size to
the input image.

The RX algorithm is generally applicable for multi-band images with zero mean
and uncorrelated Gaussian background [Reed and Yu, 1990]. For an image
pixel $I(i,j) = I_l$, the RX statistic $r$ for a $J$-band image is given by

$$ r = \mu_s^T M^{-1} \mu_s \tag{4.1} $$

where $M$ is the scaled locally estimated covariance matrix of dimension $J \times J$ given by

$$ M = \sum_{l \in W_C} I_l I_l^T $$

calculated over $N_C = \pi (r_D^2 - r_B^2)$ clutter pixels in the window $W_C$, and $\mu_s$ is the target
signature given by

$$ \mu_s = \frac{1}{\sqrt{N_T}} \sum_{l \in W_T} I_l $$

where $N_T = \pi r_T^2$ is the number of target pixels used in estimating $\mu_s$.

The scaling of the RX statistic presented here is slightly different from that used in
other literature [Menon, 2005; Ganju, 2006] to ensure consistency with the original RX
paper [Reed and Yu, 1990]. The difference arises because of the way $M$ and $\mu_s$ are
defined. The RX statistic $r$ used here can be written as

$$ r = (N_T / N_C)\, r_x $$

where $r_x$ is the RX statistic defined in [Menon, 2005; Ganju, 2006].
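
For a single-band image (J = 1), Equation (4.1) reduces to a scalar ratio, which the following sketch evaluates at one pixel using circular masks. The function name, the use of discrete pixel counts in place of the area formulas for N_T and N_C, and the choice to compute the local mean over the full demeaning disk are assumptions made for illustration.

```python
import math

def rx_single_band(image, ci, cj, r_t, r_b, r_d):
    """Single-band RX statistic at pixel (ci, cj); image is a list of rows."""
    target, clutter, window = [], [], []
    reach = int(math.ceil(r_d))
    for i in range(ci - reach, ci + reach + 1):
        for j in range(cj - reach, cj + reach + 1):
            if not (0 <= i < len(image) and 0 <= j < len(image[0])):
                continue  # skip pixels outside the image
            dist = math.hypot(i - ci, j - cj)
            if dist > r_d:
                continue
            value = image[i][j]
            window.append(value)           # demeaning mask: full disk of radius r_d
            if dist <= r_t:
                target.append(value)       # target mask
            elif dist > r_b:
                clutter.append(value)      # clutter mask; blanking annulus is skipped
    local_mean = sum(window) / len(window)
    # mu_s = (1 / sqrt(N_T)) * sum of demeaned target pixels
    mu_s = sum(v - local_mean for v in target) / math.sqrt(len(target))
    # M = sum of squared demeaned clutter pixels (unnormalized, scalar for J = 1)
    m = sum((v - local_mean) ** 2 for v in clutter)
    if m == 0.0:
        return math.inf if mu_s != 0.0 else 0.0
    return mu_s * mu_s / m
```

Evaluating this function at every pixel would produce the signal to clutter image described above; a practical implementation would use convolutions rather than explicit loops.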

Under the assumption of zero-mean uncorrelated Gaussian variables, the
probability density functions obtained by Reed and Yu [Reed and Yu, 1990] for the RX
statistic at the background location and the target location are given by

(4.2)

(4.3)

where $f(r \mid H_0)$ is the probability density function of the RX statistic given that the pixel is
not a mine target (null hypothesis) and $f(r \mid H_1)$ is the probability density function given
that the pixel location belongs to a mine (non-null hypothesis). ${}_1F_1(x; y; z)$ is the confluent
hypergeometric function, $N_C$ denotes the number of pixels in the clutter template in the
neighborhood, $J$ represents the number of bands, and $B(\gamma, \eta)$ denotes the beta density
function. The generalized signal to noise ratio (GSNR) or scale factor 'a' is given by


a  =  (SCR)JNr 


(4.4) 


—2 

SCR  =  tL 
(J 


(4.5) 


where $\mu$ is the mean target signature, $\sigma$ is the standard deviation of the background area,
and $N_T$ is the number of pixels in the target template.

The RX statistics for the background location and target location given by Equation
(4.2) and Equation (4.3) can also be modeled by the central F and non-central F
distributions. The central F distribution is defined as the ratio of two central chi-squared
variates, each divided by its degrees of freedom, and the non-central F distribution is
defined as the corresponding ratio of a non-central chi-squared variate to a central
chi-squared variate. Central F and non-central F distributions are defined as
[Johnson et al., 1995]:

where $v_1$ is the numerator degrees of freedom (DOFn), $v_2$ is the denominator degrees of
freedom (DOFd), and $a$ is the GSNR. Comparing Equations (4.2) and (4.6), the RX
distribution under a null hypothesis can be easily transformed into a central F distribution
with the transformation

$$ x = r\,\frac{v_2}{v_1} \tag{4.8} $$

where $v_1 = J$ and $v_2 = N_C - J$, and $B(v_1/2, v_2/2)$ is the complete Beta function.
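
For reference, the standard textbook form of the central F density with $v_1$ and $v_2$ degrees of freedom is reproduced below in the notation of this section. It is given as an aid to the reader and may differ in presentation from the form used in [Johnson et al., 1995].

```latex
f(x; v_1, v_2) = \frac{1}{B\!\left(\tfrac{v_1}{2}, \tfrac{v_2}{2}\right)}
  \left(\frac{v_1}{v_2}\right)^{v_1/2} x^{\,v_1/2 - 1}
  \left(1 + \frac{v_1}{v_2}\,x\right)^{-(v_1 + v_2)/2}, \qquad x > 0
```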

Thus, the RX detection statistic can be modeled by an F distribution by scaling the
detection statistic by $(N_C - J)/J$. However, due to a non-ideal environment,
different terrain types, and local correlation in the background data, the statistic in
Equation (4.7) often does not follow an F distribution. However, it is postulated that a scaled
RX statistic defined as


x  =  k  —  r  (4.9) 

may actually follow an F distribution. This scale factor $k$ is very similar to the scale factor
$\lambda$ used in [Ramachandran, 2004]. This scale factor $k$ also needs to be estimated. For
the present case, the scaling is derived using the statistical method described in
Appendix C. Figure 4.2 shows some F distributions with different DOFn and DOFd.


Figure 4.2. PDF for a Central F Distribution with Different Values of DOFn and DOFd


4.2.  MODELING  NON-MAX  SUPPRESSION 

The signal-to-clutter image obtained from the RX algorithm is subjected to non-max
suppression to obtain the list of anomaly values that are highlighted by the anomaly
detector. Non-maximal suppression is a processing algorithm that suppresses (sets to zero)
all the targets in a specific neighborhood (R-pixel radius) except the local maximum. The
target list so obtained is effectively the list of row and column coordinates of the
potential targets along with the AD values at those coordinates. The operation of
non-maximal suppression can be explained with the help of Figure 4.3. Let the target location
be specified by l(i,j) (marked by 'X' in the figure) and let R be the radius of the local
neighborhood. Only the RX values inside the radius R are considered for the non-maximal
operation. The RX value at the central location l is set to zero if it is not the maximum in
this R-pixel neighborhood; otherwise it is left unchanged and becomes a potential target. The
same operation is repeated for each pixel location in the image, and all local maxima are
selected.
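The non-maximal suppression step described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the function name `non_max_suppress`, the use of `scipy.ndimage.maximum_filter`, the disk-shaped footprint, and the handling of image borders are all assumptions made for this example. Tied maxima inside a neighborhood are all kept.

```python
import numpy as np
from scipy.ndimage import maximum_filter


def non_max_suppress(rx, radius):
    """Zero every pixel that is not the maximum of its R-pixel disk.

    Hypothetical helper: `rx` is a 2-D float array of RX values and
    `radius` is the neighborhood radius R in pixels.
    """
    r = int(radius)
    # Disk-shaped footprint: offsets (di, dj) with di^2 + dj^2 <= R^2.
    di, dj = np.mgrid[-r:r + 1, -r:r + 1]
    footprint = (di ** 2 + dj ** 2) <= radius ** 2
    # Largest RX value inside the disk centered on each pixel. Pixels
    # outside the image are treated as -inf so they can never win.
    local_max = maximum_filter(rx, footprint=footprint,
                               mode="constant", cval=-np.inf)
    keep = rx == local_max
    suppressed = np.where(keep, rx, 0.0)
    rows, cols = np.nonzero(keep)
    # Target list: (row, column, RX value) for every surviving pixel.
    targets = [(int(i), int(j), float(rx[i, j])) for i, j in zip(rows, cols)]
    return suppressed, targets
```

Calling `non_max_suppress(rx, radius=2)` on an RX image returns both the suppressed image and the target list of coordinates with their anomaly values, which is the form of output the paragraph above describes.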




Let the function g(l) represent the mapping function performed by the non-max
operation, and let B_R be the R-pixel neighborhood. Thus,

$$g(l) = \begin{cases} 1, & \text{if } x_l \text{ is the local maximum in } B_R \\ 0, & \text{elsewhere} \end{cases}$$

If f(x | H_0) is the probability density function (PDF) under the null hypothesis, then the
probability density function used to model the background clutter statistics after non-max
suppression is given by [Ramachandran, 2004]:

$$f_{NM}(x \mid g(l) = 1, H_0) = \frac{f(x \mid H_0)\, e^{-N(1 - F_0(x))}}{\int_0^{\infty} f(x \mid H_0)\, e^{-N(1 - F_0(x))}\, dx} \qquad (4.10)$$


where N = θA is the expected (average) number of independent targets present in the
neighborhood B_R, A is the area of the neighborhood B_R, θ is the density of targets, and
F_0(x) is the cumulative distribution function (CDF) for f(x | H_0), defined as

$$F_0(x) = P(X \le x \mid H_0) = \int_0^{x} f(t \mid H_0)\, dt$$


Similarly, the probability density function used to model the mine target statistics
after non-max suppression is given by:

$$f_{NM}(x \mid g(l) = 1, H_1) = \frac{f(x \mid H_1)\, e^{-N(1 - F_0(x))}}{\int_0^{\infty} f(x \mid H_1)\, e^{-N(1 - F_0(x))}\, dx} \qquad (4.11)$$

where f(x | H_1) is the PDF under the non-null hypothesis.

Figure 4.4 shows the post non-max F distribution for various values of DOFn, DOFd, and N.
As seen in Figure 4.4, the distribution can take different shapes depending on the values of
DOFn and DOFd. Also, a larger value of N has the effect of pushing the distribution toward
larger values.

Figure 4.4. Post Non-max PDF for an F Distribution with Various DOFn, DOFd, and N
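The post non-max background density of Equation (4.10) can be evaluated numerically. The sketch below is an illustration under stated assumptions: the helper name `post_nonmax_pdf`, the integration grid, and the use of trapezoidal integration are choices made here, and the parameter values (v1 = 3, v2 = 15, N = 70) are simply those quoted in the title of Figure 4.5.

```python
import numpy as np
from scipy import stats
from scipy.integrate import trapezoid


def post_nonmax_pdf(x, pdf, cdf0, n_expected):
    """Evaluate f(x) * exp(-N * (1 - F0(x))), normalized on the grid x."""
    weight = np.exp(-n_expected * (1.0 - cdf0(x)))
    unnormalized = pdf(x) * weight
    # The denominator of Equation (4.10), approximated on the grid.
    return unnormalized / trapezoid(unnormalized, x)


# Grid for numerical integration (an assumption; the tail beyond 200 is
# negligible for these degrees of freedom).
x = np.linspace(1e-6, 200.0, 200001)
background = stats.f(dfn=3, dfd=15)          # central F under H0
f_nm = post_nonmax_pdf(x, background.pdf, background.cdf, n_expected=70)

# Location of the peak before and after non-max suppression.
mode_before = x[np.argmax(background.pdf(x))]
mode_after = x[np.argmax(f_nm)]
```

Comparing `mode_before` with `mode_after` shows the shift toward larger values that the text attributes to the weighting term, and repeating the call with a larger `n_expected` moves the peak further to the right.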

Figure 4.5 shows a simulated PDF of the background and mine RX statistics for different
GSNR values after non-max suppression. As the PDFs show, the number of detected mines and the
number of detected anomalies differ for a given threshold. The PDF for the background RX
values is drawn using Equation (4.10), and the corresponding PDF for the mine RX values is
drawn using Equation (4.11). The background and mine RX values are assumed to follow central F
and noncentral F distributions, respectively. GSNR is responsible for the statistical
difference between the background RX (blue) and mine RX (red, magenta, and black) values,
which is evident in Figure 4.5.
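To illustrate how a single threshold yields different exceedance probabilities for background and mine detections, the sketch below evaluates Equations (4.10) and (4.11) with a central F for the background and a noncentral F for the mines. The threshold value and the noncentrality values are hypothetical; the text does not give the mapping from GSNR to the noncentrality parameter, so none is assumed here. The values v1 = 3, v2 = 15, N = 70 are taken from the title of Figure 4.5.

```python
import numpy as np
from scipy import stats
from scipy.integrate import trapezoid

x = np.linspace(1e-6, 200.0, 20001)       # integration grid (assumption)
v1, v2, n_expected = 3, 15, 70            # values from the Figure 4.5 title
f0 = stats.f(v1, v2)                      # background: central F
# Weight shared by Equations (4.10) and (4.11); it uses the background CDF.
weight = np.exp(-n_expected * (1.0 - f0.cdf(x)))


def normalized(pdf_values):
    """Apply the non-max weight and normalize to unit area on the grid."""
    unnormalized = pdf_values * weight
    return unnormalized / trapezoid(unnormalized, x)


def exceed_prob(density, threshold):
    """Probability mass above the threshold, by numerical integration."""
    mask = x >= threshold
    return trapezoid(density[mask], x[mask])


threshold = 10.0                          # hypothetical detection threshold
p_false_alarm = exceed_prob(normalized(f0.pdf(x)), threshold)
# Hypothetical noncentrality values standing in for increasing GSNR.
p_detect = {nc: exceed_prob(normalized(stats.ncf(v1, v2, nc).pdf(x)), threshold)
            for nc in (2.0, 8.0, 20.0)}
```

Sweeping `threshold` over a range and plotting `p_detect` against `p_false_alarm` would trace out a ROC-style curve for each noncentrality value.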

4.3.  PARAMETER  ESTIMATION  FOR  RX  DETECTIONS 

Modeling of detection statistics is very useful because it provides valuable
information about the spatial correlation and non-homogeneity in the data. In order to
compare the performance of the RX anomaly detector for various terrains (sparsely
vegetated, densely vegetated, and/or dirt) and for different times of day (morning,
afternoon), it is very important to model the RX statistics using probabilistic models.

[Plot title: RX value distribution of BKG and mines with different GSNR (v1 = 3, v2 = 15, N = 70)]

Figure 4.5. PDF for a Post Non-Max Background Anomaly and Mine Anomaly Statistic Value for Different GSNR Values


Previously,  RX  detection  statistics  have  been  successfully  modeled  using  Beta 
and  Gamma  distributions  [Ganju,  2006;  Webb,  2000;  Copsey  and  Webb,  2001;  Huiyan  et 
al.,  2005].  In  this  section,  the  RX  detection  statistics  will  be  modeled  with  a  central  F 
distribution.  The  model  estimation  for  the  central  F  distribution  is  done  using  the  EM 
(Expectation  Maximization)  technique.  Various  statistical  tests  are  applied  to  check  the 
goodness  of  fit  of  the  modeled  distribution  with  the  actual  data. 

As  explained  in  Section  4.1,  the  background  RX  statistics  can  be  modeled  by  an  F 
distribution  i.e., 
template - number of spectral bands).
Once the non-max suppression is performed, the post non-max RX statistics
becomes

$$f_{NM}(r, v_1, v_2, N) = \frac{f(r)\, e^{-N(1 - F(r))}}{\int_0^{\infty} f(r)\, e^{-N(1 - F(r))}\, dr} \qquad (4.13)$$

where F(r) is the CDF of f(r) and N is the number of pixels in the local
neighborhood.

The EM algorithm is used to estimate the three parameters (v1, v2, and N) of the
post non-max central F distribution. The initial parameters used to start the EM
algorithm are very important and must be chosen carefully. The method of moments is
currently used to derive the initial parameters. If there are k parameters to estimate, then
the first k sample moments are equated to the actual moments of the distribution, given
that the actual moments are functions of the parameters of interest. Other details and the
actual derivation of the initial parameters are discussed in Appendix C.
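As an illustration of the method of moments, the sketch below inverts the textbook mean and variance of an unscaled central F distribution, mean = v2/(v2 - 2) for v2 > 2 and variance = 2 v2^2 (v1 + v2 - 2) / (v1 (v2 - 2)^2 (v2 - 4)) for v2 > 4, to obtain starting values for v1 and v2. This is an assumption made for illustration and is not the derivation of Appendix C, which may differ (for example by including the scale factor). The function name `f_params_from_moments` is hypothetical.

```python
import numpy as np
from scipy import stats


def f_params_from_moments(mean, var):
    """Return (v1, v2) of a central F matching the given mean and variance."""
    if mean <= 1.0:
        raise ValueError("mean must exceed 1 for a finite v2")
    v2 = 2.0 * mean / (mean - 1.0)            # from mean = v2 / (v2 - 2)
    denom = var * (v2 - 4.0) - 2.0 * mean ** 2
    if v2 <= 4.0 or denom <= 0.0:
        raise ValueError("moments are incompatible with a finite-variance F")
    v1 = 2.0 * mean ** 2 * (v2 - 2.0) / denom  # from the variance equation
    return v1, v2


def f_initial_guess(samples):
    """Equate the first two sample moments to the model moments."""
    samples = np.asarray(samples, dtype=float)
    return f_params_from_moments(samples.mean(), samples.var(ddof=1))


# Sanity check with the exact moments of F(3, 15) instead of sample moments.
exact_mean, exact_var = stats.f(3, 15).stats(moments="mv")
v1_0, v2_0 = f_params_from_moments(float(exact_mean), float(exact_var))
```

With exact moments the inversion recovers the degrees of freedom used to generate them; with sample moments the result is only a starting point for an iterative estimator such as EM.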

Estimation is carried out in two steps. The first step deals with the formulation of
an update equation for the central F distribution, and the second step explains the use of
that update equation in estimating the parameters. Appendix B explains the EM algorithm
with its mathematical formulation and the update equation for estimating the parameters for
RX modeling. Once the estimation is completed, various statistical tests are used to
measure the goodness of fit of the estimated parameters. These tests are described in
detail in Appendix D. The confidence level for the current results is taken to be 0.97. If
the chi square test statistic for a particular segment is less than the threshold, then the
segment is said to pass the test; otherwise the segment is said to fail the test. Here, passing
the test implies that a reasonable model is obtained for the RX statistics of the given
segment. Because the confidence level is set to 0.97, it is expected that 97% of the segments
in a given dataset will pass the test and 3% will fail when the model is adequate. If the
percentage of failures is much higher than 3%, then the modeling is considered poor; otherwise
it is considered good.
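The pass/fail rule described above can be sketched with a generic Pearson chi-square goodness-of-fit test. The exact tests of Appendix D are not reproduced here; the binning scheme (equal-count bins from sample quantiles), the number of bins, the degrees-of-freedom correction for three estimated parameters, and the function name `chi_square_fit_test` are all assumptions for this illustration.

```python
import numpy as np
from scipy import stats


def chi_square_fit_test(samples, model_cdf, n_bins=20, n_params=3, level=0.97):
    """Pearson chi-square test of samples against a fitted model CDF.

    Returns (statistic, threshold, passed), where passed is True when the
    statistic is below the chi-square quantile at the given level.
    """
    samples = np.asarray(samples, dtype=float)
    # Interior bin edges from sample quantiles; the outer bins are open-ended.
    inner = np.quantile(samples, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    observed = np.bincount(np.searchsorted(inner, samples, side="right"),
                           minlength=n_bins)
    # Expected counts from the model CDF evaluated at the same edges.
    cdf_at_edges = np.concatenate(([0.0], model_cdf(inner), [1.0]))
    expected = samples.size * np.diff(cdf_at_edges)
    statistic = float(np.sum((observed - expected) ** 2 / expected))
    dof = n_bins - 1 - n_params
    threshold = float(stats.chi2.ppf(level, dof))
    return statistic, threshold, statistic < threshold
```

Applying this function to every segment of a dataset and taking the fraction with `passed == True` gives a pass percentage that can be compared against the nominal 97%.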
4.4. MODELING RESULTS

As mentioned before, RX statistics have in the past been modeled using Beta and
Gamma distributions. However, all previous work used frame-wise modeling. For the
present case, an attempt has been made at segment-wise modeling of the RX statistics with a
central F distribution. Estimation using Beta and Gamma distributions is still possible but
has not been implemented here.

After the non-max operation there are about 350 samples in each frame. With 21
frames per segment, there are approximately 7000 samples per segment. However, not all of
these samples are used for modeling; all the samples within 20 pixels of the four edges of
the frames are ignored to remove the bias caused by the corresponding edge pixels. This
reduces the total number of samples used for estimation.
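The edge exclusion described above amounts to a simple coordinate filter on the detection list. The following sketch assumes detections are given as row and column arrays and that the frame size is known; the function name and the example frame shape are hypothetical.

```python
import numpy as np


def drop_edge_detections(rows, cols, frame_shape, margin=20):
    """Keep only detections at least `margin` pixels from every frame edge."""
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    height, width = frame_shape
    keep = ((rows >= margin) & (rows < height - margin) &
            (cols >= margin) & (cols < width - margin))
    return rows[keep], cols[keep]
```

For a hypothetical 100 x 100 frame, a detection at (50, 50) is kept while detections at (5, 50), (95, 50), and (50, 10) are dropped.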

The  modeling  is  done  for  eight  different  types  of  datasets  depending  on  the  time 
of  day  (morning  background  segments,  afternoon  background  segments)  and  background 
type  (sparse  vegetation  background  segments,  dense  vegetation  background  segments). 
Morning  segments  are  chosen  between  8:00  am  and  9:00  am,  and  afternoon  segments  are 
chosen  between  2:00  pm  and  3:00  pm.  For  comparison  between  the  background  and 
minefield  segments,  both  background  only  and  minefield  segments  are  used  for  the 
modeling. 

For the background segments, all 21 frames (corresponding to seven swaths and
three steps) are used for the modeling. However, only clean frames (without any fiducial
or manmade artifact) are used for modeling in the case of the minefield segments. The
minefield segments are visually inspected so that any frame with fiducials or other
artifacts is excluded from the modeling. Approximately 50 segments (equivalent to 1050
frames) are used to generate the modeling results for all eight cases. Three different
target radii (rT) of zero, one, and two are used for comparison.

Modeling is done for both single band and multiband data. An individual MSI
band 4 (NIR) is used to represent the single band data, and the RGB colored bands as well as
all four MSI bands are used for the multiband modeling. Table 4.1 shows the number of
segments and the exact number of frames corresponding to each dataset used for the
modeling. Thus, a total of 24 sets (eight datasets and three target radii for each dataset)
are modeled and the fit is checked for the central F distribution.

Table 4.1. Data Type with Number of Segments and Frames Used for Modeling

Data Type                               Number of Segments Used   Total Number of Frames Used
Morning Background Segments             40                        40 x 21 = 840
Morning Minefield Segments              45                        687
Afternoon Background Segments           40                        40 x 21 = 840
Afternoon Minefield Segments            47                        736
Sparse Vegetation Background Segments   80                        80 x 21 = 1680
Sparse Vegetation Minefield Segments    50                        790
Dense Vegetation Background Segments    59                        59 x 21 = 1239
Dense Vegetation Minefield Segments     68                        886

Figure 4.6 shows a representative segment from the sparse vegetation background.
The corresponding distribution fit for the inverse CDF and PDF is shown in Figure 4.7 for
the RGB colored registered segment and target radius rT = 2. The sample values for the F
distribution are plotted on the x axis, and the corresponding CDF and PDF values are
plotted on the y axis. The first subplot shows the inverse CDF for the actual RX values
(blue) and the estimated values (broken red), and the corresponding PDF is shown in the
second subplot. The third subplot shows the chi square bin error values. The estimated and
initial parameters for the EM algorithm are shown in the title of the second subplot of
Figure 4.7. The initial v1 and v2 are derived using the method of moments as discussed in
Appendix C. The initial value of N is chosen to be 100. The pass or fail value (one or
zero, respectively) is shown in the title of the third subplot of Figure 4.7.

Figure 4.8 shows a representative dense vegetation minefield segment, and Figure 4.9
shows the corresponding distribution fit for the inverse CDF and PDF of the actual and
estimated samples, along with the chi square bin error. The RGB colored segment with
rT = 2 is used for the distribution fit in this case as well.

Figure 4.6. Sparse Vegetation Background Segment


Tables 4.2 - 4.9 show the pass percentages for sparse vegetation background
segments, dense vegetation background segments, sparse vegetation minefield segments,
dense vegetation minefield segments, morning background segments, afternoon
background segments, morning minefield segments, and afternoon minefield segments,
respectively.

In the case of minefield segments, the frames with fiducials are identified through
visual inspection and excluded from the modeling to prevent any bias caused by the
fiducials.


Figure  4.8.  Dense  Vegetation  Background  Segment 


Figure 4.9. Distribution Fit for Dense Vegetation Background Segment


Table 4.2. Pass Percentages for Sparse Vegetation Background Segments

Sparse Vegetation Background Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                83.75             85.00             90.00
RGB Colored        90.00             88.75             96.25
All 4 MSI Bands    81.00             95.00             97.50

Table 4.3. Pass Percentages for Dense Vegetation Background Segments

Dense Vegetation Background Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                66.10             83.05             92.98
RGB Colored        81.36             100.00            96.61
All 4 MSI Bands    81.03             87.70             98.30

Table 4.4. Pass Percentages for Sparse Vegetation Minefield Segments

Sparse Vegetation Minefield Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                80.00             96.00             98.00
RGB Colored        92.00             98.00             100.00
All 4 MSI Bands    90.00             94.00             98.00

Table 4.5. Pass Percentages for Dense Vegetation Minefield Segments

Dense Vegetation Minefield Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                91.10             94.11             95.59
RGB Colored        92.60             98.00             98.53
All 4 MSI Bands    100.00            97.00             97.02

Table 4.6. Pass Percentages for Morning Time Background Segments

Morning Time Background Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                65.00             77.50             90.00
RGB Colored        75.00             95.00             97.50
All 4 MSI Bands    70.00             92.50             97.44

Table 4.7. Pass Percentages for Afternoon Time Background Segments

Afternoon Time Background Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                72.50             85.00             95.00
RGB Colored        75.00             95.00             97.50
All 4 MSI Bands    72.50             87.50             90.00

Table 4.8. Pass Percentages for Morning Time Minefield Segments

Morning Time Minefield Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                75.56             100.00            100.00
RGB Colored        82.22             97.78             97.78
All 4 MSI Bands    75.56             95.46             95.56

Table 4.9. Pass Percentages for Afternoon Time Minefield Segments

Afternoon Time Minefield Segments
Pass Percentages (%) for 97% confidence level, RX

Band               Target Radius 0   Target Radius 1   Target Radius 2
NIR                72.34             95.75             97.83
RGB Colored        87.23             97.87             97.87
All 4 MSI Bands    81.82             97.67             97.78

As shown in Tables 4.2 - 4.9, the pass percentages generally improve with an increase in
the target radius. Moreover, as can be recalled from Section 4.3, at least 97% of the
segments should pass the chi square test for the modeling to be considered good. Modeling
seems to be poor for a target radius of zero for most of the datasets. The reason for this
behavior could be a high propensity of single-pixel outliers in the image data. However, a
target radius of zero is not used in most practical applications, so this case can be
ignored. As the target radius increases, the percentage of segments passing the chi square
test improves, with some exceptions. For target radii of one and two, the pass percentages
for most of the datasets are near 97%, and hence the modeling can be considered good. This
is also expected because, with a higher target radius and hence lower effective resolution,
the image is effectively independent Gaussian.

Table  4.10  shows  the  distribution  of  segments  among  different  datasets  in 
accordance  with  the  confidence  level.  The  total  number  of  segments  for  each  case  of 
different  target  radii  among  all  the  datasets  is  also  mentioned  in  brackets.  As  noted 
earlier,  the  confidence  level  is  set  to  0.97  and  hence  the  segments  are  characterized 
accordingly  into  four  major  categories: 

Category  1  (threshold  <  0.95):  This  category  represents  those  segments  that  easily 
passed  the  chi  square  test; 

Category  2  (0.95  <  threshold  <  0.97):  This  category  represents  those  segments  that 
barely  passed  the  chi  square  test; 

Category 3 (0.97 < threshold < 0.99): This category represents those segments that
barely failed the chi square test; and

Category  4  (threshold  >  0.99):  This  category  represents  those  segments  that  easily  failed 
the  chi  square  test. 
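The four categories, and their link to the pass percentages in Tables 4.2 - 4.9, can be sketched as follows. The sketch assumes that the "threshold" value in the list above is the chi-square CDF evaluated at a segment's test statistic; this is an interpretation, since the text does not define it explicitly. The treatment of values falling exactly on a boundary is also an assumption. The counts in the example are the Table 4.10 entries for dense vegetation background, NIR band, target radius 0.

```python
def fit_category(cdf_value):
    """Map a chi-square CDF value to the four categories listed above."""
    if cdf_value < 0.95:
        return 1   # easily passed
    if cdf_value < 0.97:
        return 2   # barely passed
    if cdf_value < 0.99:
        return 3   # barely failed
    return 4       # easily failed


def pass_percentage(counts):
    """Percentage of segments in categories 1 and 2 (the passing ones)."""
    total = sum(counts.values())
    return 100.0 * (counts[1] + counts[2]) / total


# Category counts from Table 4.10: dense vegetation background, NIR, radius 0.
dense_bkg_nir_r0 = {1: 37, 2: 2, 3: 5, 4: 15}
dense_bkg_nir_r0_pass = pass_percentage(dense_bkg_nir_r0)
```

Here 37 + 2 = 39 of 59 segments pass, which is about 66.10%, in agreement with the NIR, target radius 0 entry of Table 4.3.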

Category 4 consists mainly of those segments that are highly non-homogeneous.
Examples of these segments are shown in Figures 4.10 and 4.14, which illustrate the
non-homogeneity of the background.

Also, for a small target radius, the tail of the distribution is heavier than for a
larger target radius, especially when most of the frames are dirt frames. This is
demonstrated in Figure 4.10, which displays a dirt-only frame. Figure 4.11 shows both the
actual and the estimated PDF for the same FoR (NIR band only) with target radii of 0 and 2.
It can be seen that as the target radius increases, the heaviness of the tail decreases.
However, for both target radii, this segment passes the chi square test.


Table 4.10. Distribution of the Segments Modeled by Central F According to the Confidence Level

                                         NIR Band                  RGB Band                  All 4 MSI Bands
Dataset Name              Threshold      Target Radius             Target Radius             Target Radius
                          Range          = 0     = 1     = 2       = 0     = 1     = 2       = 0     = 1     = 2
Dense Veg. Bkg. (59)      < 0.95         37(59)  48(59)  52(57)    47(59)  56(59)  55(59)    46(58)  48(57)  58(59)
                          0.95 to 0.97   2       1       1         1       3       2         1       2       0
                          0.97 to 0.99   5       3       1         4       0       1         3       3       1
                          > 0.99         15      7       3         7       0       1         8       4       0
Dense Veg. MF. (68)       < 0.95         59(68)  63(68)  61(68)    63(68)  66(68)  67(68)    64(65)  64(67)  64(67)
                          0.95 to 0.97   3       1       4         0       1       0         1       1       1
                          0.97 to 0.99   2       1       1         4       1       1         0       1       1
                          > 0.99         4       3       2         1       0       0         0       1       1
Sparse Veg. Bkg. (80)     < 0.95         64(80)  64(80)  71(80)    69(80)  68(80)  75(80)    59(80)  72(80)  76(80)
                          0.95 to 0.97   3       4       1         3       3       2         7       4       2
                          0.97 to 0.99   3       4       2         6       4       2         3       3       2
                          > 0.99         10      8       6         2       5       1         11      1       0
Sparse Veg. MF. (50)      < 0.95         39(50)  47(50)  48(50)    44(50)  48(50)  47(50)    43(50)  46(50)  49(50)
                          0.95 to 0.97   1       1       1         2       1       3         2       1       0
                          0.97 to 0.99   5       1       0         3       0       0         1       2       0
                          > 0.99         5       1       1         1       1       0         4       1       1
Morning Time Bkg. (40)    < 0.95         19(40)  31(40)  34(40)    31(40)  36(40)  39(40)    24(40)  37(40)  38(39)
                          0.95 to 0.97   7       0       2         1       2       0         4       0       0
                          0.97 to 0.99   4       2       2         2       2       0         5       0       0
                          > 0.99         10      7       2         6       0       1         7       3       1
Morning Time MF. (45)     < 0.95         32(45)  40(45)  45(45)    33(45)  44(45)  43(45)    32(45)  42(45)  42(44)
                          0.95 to 0.97   2       5       0         4       0       1         2       1       0
                          0.97 to 0.99   4       0       0         2       0       1         3       0       1
                          > 0.99         7       0       0         6       1       0         8       2       1
Afternoon Time Bkg. (40)  < 0.95         27(40)  34(40)  36(40)    28(40)  36(40)  39(40)    29(40)  31(40)  34(40)
                          0.95 to 0.97   2       2       2         2       2       0         0       4       2
                          0.97 to 0.99   2       3       0         2       1       0         7       0       1
                          > 0.99         9       1       2         8       1       1         4       5       3
Afternoon Time MF. (47)   < 0.95         31(47)  44(47)  45(46)    41(47)  45(47)  46(47)    35(44)  40(43)  40(45)
                          0.95 to 0.97   2       1       0         0       1       0         1       2       4
                          0.97 to 0.99   4       0       0         2       1       1         1       0       0
                          > 0.99         10      2       1         4       0       0         7       1       1

Figure 4.10. Sparse Vegetation Background Segment Predominantly Dirt

Figure 4.11. Actual and Estimated PDF for the NIR Band for Target Radii Zero and Two


Another interesting case is a segment that consists of various backgrounds. For these
mixtures of backgrounds, the EM algorithm fails to provide an accurate fit for the F
distribution according to the chi square test. Examples of such mixtures of backgrounds are
shown in Figures 4.12 and 4.14. Figure 4.12 shows a sparse vegetation segment with two
different backgrounds. As can be seen in the segment, some of the frames (Frames 13-21)
contain a bright path (encircled in broken cyan), in contrast to the other, non-bright
frames (Frames 1-12). For this bright background, the EM algorithm fails to find an
accurate fit. However, for the rest of the frames (non-bright), the distribution passes the
chi square test. This is common behavior for these types of background mixtures, in which a
whole segment is penalized because of a non-homogeneous background. Figure 4.13 shows the
actual and estimated PDF and inverse CDF for the bright and non-bright backgrounds.
Figure 4.13(a) shows the actual and estimated PDF, and Figure 4.13(b) shows the
corresponding inverse CDF.


Figure 4.12. Segment Showing the Non-Homogeneous Dirt Background


It is worthwhile to note that EM modeling fails for the bright area, whereas it
passes for the rest of the segment. This is also visible in Figure 4.13(b), in which the
actual inverse CDF (blue) and the estimated inverse CDF (broken red) differ considerably,
which results in failure. Also, the actual PDF for the bright area (solid blue) is noisier
than the actual PDF for the rest of the segment (broken green).

The presence of different and non-homogeneous backgrounds is one of the main
reasons for failure of the distribution fit. In this case, the background is either dirt or
dense vegetation. If the segment is split on the basis of the background, i.e., if the
dirt and vegetation frames from the same segment are separated and modeled separately, the
fit passes the chi square test for the vegetative frames but fails for the dirt frames.
Also, the tail for the dirt RX samples is heavier than for the vegetative RX samples. A
segment with two different backgrounds (dirt and dense vegetation) is shown in Figure 4.14.
Figure 4.15 shows the actual and estimated PDF and inverse CDF for the two backgrounds.
Figure 4.15(a) shows the actual and estimated PDF, and Figure 4.15(b) shows the
corresponding inverse CDF.


[Panel titles: PDF for various bkgs.; Inverse CDF for various bkgs.]

Figure 4.13. Actual and Estimated PDF and Inverse CDF for Bright and Non-Bright Background


As is evident from Figure 4.15, the sample RX values are very different for the two
backgrounds. For vegetation, the actual and estimated PDFs lie almost on top of each
other and hence pass the chi square test; for the dirt background, however, the RX values
are quite different and noisier. Also, in contrast to the vegetation frames, the dirt PDF
shows a heavy tail. Moreover, for the dirt, the actual and estimated PDFs do not fit well,
which causes the dirt background RX values to fail the chi square test. The difference
between the two backgrounds is also visible in the inverse CDFs.

Figure 4.14. Segment Showing Different (Dirt and Vegetation) Background

[Panel titles: (a) Actual and Estimated PDF; (b) Actual and Estimated Inverse CDF]

Figure 4.15. Actual and Estimated PDF and Inverse CDF for Dirt and Dense Vegetation Background


4.5.  OTHER  ANOMALY  DETECTION  TECHNIQUES 

The RX anomaly detector is one of the most popular techniques used for anomaly
detection. Some derivatives of the RX detector, such as the normalized RX detector and the
modified RX detector, are also available [Chang and Chiang, 2002]. A number of other
anomaly detection algorithms have been proposed and used depending on the situation
and other factors. Some of these are unmixing component analysis [Yanfeng et al., 2006],
kernel principal component analysis [Gu et al., 2006], cluster-based anomaly detection
[Carlotto, 2005], signal subspace processing [Ranney, 2006], support vector data
description [Banerjee et al., 2006], and anomaly detection based on multi-resolution
features [Shadhan and Cohen, 2006].

False alarm mitigation (FAM) techniques are other general techniques that can be
used for anomaly detection. In general, FAM uses the shape, photometric, polarity,
spectral, and other properties of the mine targets to reject likely false alarms. FAM
techniques depend on a number of factors such as the time of day, the size and nature of
the mine target, the nature of the terrain, and the nature of the local background of the
target. Moreover, FAM techniques exploit the characteristics of likely mine signatures to
reduce false alarms. Thus, FAM serves the dual purpose of providing a measure of mine-level
detection performance and providing other features that can be used to understand the
nature of mine targets and false alarms [Menon et al., 2004]. Circularity, radial symmetry,
and gray scale moments are some of the features that can be used for false alarm
mitigation. A detailed description of these techniques can be found in [Menon et al., 2004].

Spectral vegetation indices can also be used to detect the presence of live green
vegetation and hence for false alarm mitigation. These indices are generated by
combining data from multiple spectral bands into a single value. These indices are
discussed further in Appendix A.

5. SPATIAL DISTRIBUTION


This  section  discusses  the  spatial  distribution  of  the  targets  obtained  after  the 
anomaly  detection.  The  spatial  characteristics  (physical  locations)  of  these  RX  detections 
can  be  studied  and  a  spatial  distribution  can  be  modeled  to  fit  the  spatial  location  of 
potential  mine  detections.  Once  the  modeling  for  the  spatial  distribution  is  successful,  the 
model  can  be  incorporated  into  simulation. 

5.1.  SPATIAL  POINT  PROCESSES 

A collection of points, each representing the location of an event in space, can be
termed a spatial point process [Moller and Waagepetersen, 2006]. Mathematically, the
spatial point process is defined in the following manner [Cressie, 1991].

Let $s \in \mathbb{R}^d$ be a generic data location in a d-dimensional Euclidean space, and
suppose the potential datum Z(s) at spatial location s is a random variable. Now let s vary
over the index set $D \subset \mathbb{R}^d$; then the generated multivariate random field or
random process

$$\{Z(s) : s \in D\} \qquad (5.1)$$

is termed the random point process. A realization of such a process is called a spatial
point pattern [Moller and Waagepetersen, 2006], z, of n points such that

$$z = \{z_1, z_2, \ldots, z_n\}, \quad n > 0 \text{ points contained in } D \qquad (5.2)$$

For the current discussion, the point processes are limited to 2D space domains
($\mathbb{R}^2$). If the datum Z(s) represents only the location of a point, the process is
called a 'simple' point process, and if it contains other markings such as color, it is
called a 'marked' point process. Important inferences about the spatial occurrences of
events can be drawn from the simple point process Z(s), where each point in such a random
process represents the location of some event. A large variety of events is possible;
however, for the purpose of the current discussion, the event is the occurrence of a
landmine or a false alarm.

Spatial point patterns play a vital role in a wide variety of scientific and
engineering fields such as ecology, telecommunication, forestry, biostatistics,
geology, and economics. These patterns have also found applications in fields as diverse
as archeology, cosmology, and seismology. However, the application of spatial point
processes to minefield detection differs from many other applications in that the spatial
locations are a mixture of different processes (the presence of a minefield and the
presence of false alarms/clutter). Also, the exact identity of the process associated with
each event is unknown. In some cases it is advantageous to model such a process using the
framework of marked point processes [Trang et al., 2007]. The current discussion is,
however, limited to simple point processes.

Spatial  patterns  can  be  broadly  classified  as  three  distinct  spatial  distributions: 
random,  aggregated,  and  regular  [Kummamuru,  2002;  Reich,  2007]  as  described  next. 

5.1.1. Random Point Process. A point pattern is said to be random if the
presence or absence of a point at any other location does not affect the location of any
point. In other words, each point's location is independent of the location of any other
point. Checking the spatial randomness of the spatial point process is often the first step
in the analysis of spatial point processes. A reliable approach should be used to quantify
the randomness because, to some observers, random processes can also appear to be
clustered. Scale is very important and should be clearly indicated because purely random
processes may appear clustered at a larger scale.

5.1.2. Aggregated Point Process. This is the most common type of point process in the
real world. As the name indicates, in an aggregated point process the points occur in
clumps of different densities. In this case, the location of any point is not independent
of the location of other points; instead, the occurrence of one point favors the
occurrence of other points in its neighborhood, resulting in cluster formation.

5.1.3. Regular Point Process. Point processes in which the points are evenly
distributed over a given area, approximately forming the vertices of regular shapes such
as lines, rectangles, squares, and triangles, are called regular point patterns. In a
regular point process, the occurrence of one point disfavors the occurrence of another
point in its neighborhood. Aggregated and regular point processes are the two extreme
departures from a random point process. An example of a regular point process is provided
by walnut trees [Reich, 2007]. The roots and leaves of a walnut tree produce toxic
substances that inhibit the growth of other trees in the immediate vicinity, resulting in
a regular or uniform spatial pattern. In the case of minefields, a patterned minefield
forms an approximately regular point process.

5.2.  COMPLETE  SPATIAL  RANDOMNESS 

Complete  Spatial  Randomness  (CSR)  is  the  standard  against  which  spatial  point 
patterns  are  often  compared.  The  hypothesis  of  CSR  for  a  spatial  point  pattern  asserts  that 
[Diggle,  2003]: 

1. The number of events in any planar region A follows a Poisson distribution with
mean $\lambda |A|$, where $|A|$ is the area of A and $\lambda$ is the intensity, which
does not vary over the region, and

2. Events are equally likely to occur anywhere within area A, and no interactions
occur between the events, either repulsive (regular point process) or
attractive (aggregated point process).

This process has the property that, conditional on $N(A)$, the number of events in a
bounded region $A \subset \mathbb{R}^2$, the events of the process are independently and
uniformly distributed over A; i.e., given $N(A) = n$, the ordered n-tuple of events
$(s_1, \ldots, s_n)$ in $A^n$ satisfies the following identity [Cressie, 1991]:

$$\Pr(s_1 \in B_1, \ldots, s_n \in B_n) = \prod_{i=1}^{n} \frac{|B_i|}{|A|}, \qquad B_1, \ldots, B_n \subset A$$  (5.3)



Figure 5.1 shows the CSR process with the total area of the region under study,
$|A|$ = 1000 × 1000 m², and $\lambda$ = 0.001/m².

5.3.  MEASURES  OF  COMPLETE  SPATIAL  RANDOMNESS 


Various  approaches  have  been  used  to  quantify  various  types  of  spatial  point 
patterns  against  the  CSR  process.  The  most  common  methods  are  the  quadrat  method  and 
distance method. The quadrat method is based on sampling of the area under
consideration using small regions called quadrats. A number of statistics based on
quadrat methods have been proposed in the literature. The second type of measure
is based on a set of distance measurements made from an event to its kth nearest neighbor
or from a randomly selected sample point to its kth nearest neighbor.

5.3.1. Quadrat Measures. This method is one of the earliest proposed for measuring
spatial randomness in point processes. It involves collecting counts of the number of
events in subsets of the study region A. Traditionally these subsets are rectangular,
although any shape is possible. Quadrats may be placed randomly or laid out contiguously
in A. The number of events in each quadrat is collected, and these numbers, called
quadrat counts, are tabulated as a frequency distribution. Several types of statistics
are applied to these quadrat counts to calculate the test statistics. A detailed
description and implementation of statistics based on quadrat measures is given in
[Kummamuru, 2002], and more details can be found in [Cressie, 1991].

For the CSR process shown in Figure 5.1, square quadrats ($A_i$) of size 100 × 100 m²
are selected. For each subset $A_i$ (100 × 100 m² square) of the study region A, the
number of events is collected. For the CSR process, these counts should be samples from a
Poisson distribution with mean $\lambda |A_i|$ (= 10 in this case), where $|A_i|$ is the
area of subset $A_i$.

Figure 5.2 shows the distribution of the number of events in each quadrat (quadrat
counts) and the corresponding theoretical (Poisson) distribution with mean
$\lambda |A_i|$. As shown in Figure 5.2, the two distributions are in good agreement.
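
The quadrat check described above can be reproduced with a short simulation. The following is a minimal sketch in Python, not the implementation used in this thesis: the helper names (poisson_sample, simulate_csr, quadrat_counts) and the random seed are illustrative assumptions, while the intensity, region size, and quadrat size are the values quoted in the text. It generates a CSR pattern, counts events in contiguous 100 × 100 m² quadrats, and prints the sample mean and variance of the counts, both of which should be close to 10 if the counts are Poisson.

```python
import math
import random
from collections import Counter


def poisson_sample(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_csr(intensity, side, rng):
    """CSR on a side x side square: Poisson count, then independent uniform locations."""
    n = poisson_sample(intensity * side * side, rng)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def quadrat_counts(points, side, quadrat):
    """Count events in each cell of a contiguous grid of square quadrats."""
    cells = int(side // quadrat)
    counts = Counter()
    for x, y in points:
        ix = min(int(x // quadrat), cells - 1)   # clamp points on the upper edge
        iy = min(int(y // quadrat), cells - 1)
        counts[(ix, iy)] += 1
    return [counts[(i, j)] for i in range(cells) for j in range(cells)]


def poisson_pmf(k, mean):
    """Theoretical Poisson probability of k events in a quadrat."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


rng = random.Random(1)                      # arbitrary seed (assumption)
points = simulate_csr(0.001, 1000.0, rng)   # lambda = 0.001/m^2 on 1000 m x 1000 m
counts = quadrat_counts(points, 1000.0, 100.0)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)
print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}")
```

The histogram of counts can then be compared against poisson_pmf(k, 10.0) for each k, which is the comparison drawn in Figure 5.2.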

5.3.2. Nearest Neighbor Distance Measures. In these methods, event-to-event or
point-to-event distances are computed and summarized. Distances can be calculated
between events and their nearest neighboring events or between sample points and their
nearest events. Sample points are placed in the study area randomly or systematically.
These distances are used as test statistics, and various statistical models are used to
simulate the event-to-event nearest neighbor distances.

Figure 5.1. Complete Spatial Random Process with Density = 0.001/m²

Figure 5.2. Distribution of the Quadrat Counts

For the CSR process, the distribution of the event-to-event nearest neighbor distance
can be derived as follows [Corcoran, 2004]. Let $\lambda$ be the mean number of points
per unit area, let D be the distance between an event and its nearest neighbor, and let x
be the radius of a circle around the event. The probability that the nearest neighbor of
the event lies in the circle of radius x is given by

$$P(D \le x) = 1 - P(D > x)$$  (5.4)

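Because the process is CSR, the number of events within radius x follows a Poisson
distribution with mean $\lambda \pi x^2$, so the probability of finding no point within
radius x (> 0) is given by

$$P(D > x) = \frac{e^{-\lambda \pi x^2} (\lambda \pi x^2)^0}{0!} = e^{-\lambda \pi x^2}$$  (5.5)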

Thus Equation (5.4) reduces to

$$F_D(x) = P(D \le x) = 1 - e^{-\lambda \pi x^2}$$  (5.6)

where $F_D(x)$ is the cumulative distribution function of the nearest neighbor distance.

Thus, the probability density function for the nearest neighbor distance can be found
by differentiating Equation (5.6) with respect to x; i.e.,

$$f_D(x) = \frac{d}{dx} F_D(x) = \frac{d}{dx}\left(1 - e^{-\lambda \pi x^2}\right) = 2 \lambda \pi x \, e^{-\lambda \pi x^2}, \quad x > 0$$  (5.7)


Equation (5.7) represents the probability density function (PDF) of the nearest
neighbor distance x. The above derivation can also be extended to the kth nearest
neighbor. The PDF of the kth nearest neighbor distance $x_k$ is given as [Cressie, 1991]

$$f_D(x_k) = \frac{2 (\lambda \pi)^k \, x_k^{2k-1} \, e^{-\lambda \pi x_k^2}}{(k-1)!}, \quad x_k > 0$$  (5.8)

where $x_k$ is the kth nearest neighbor distance from an arbitrary event.
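
As a numerical sanity check on the nearest neighbor densities, the following Python sketch evaluates the kth nearest neighbor PDF, verifies that it integrates to one, and compares the mean first nearest neighbor distance with the closed form 1/(2√λ) that follows from the k = 1 density. The function names (nn_pdf, nn_cdf, integrate), the trapezoidal rule, and the integration limit of 200 m are illustrative choices, not part of the original work; the intensity is the value used for Figure 5.1.

```python
import math


def nn_pdf(x, lam, k=1):
    """PDF of the k-th nearest neighbor distance under CSR (k = 1 gives Equation 5.7)."""
    if x <= 0:
        return 0.0
    a = lam * math.pi
    return 2.0 * a ** k * x ** (2 * k - 1) * math.exp(-a * x * x) / math.factorial(k - 1)


def nn_cdf(x, lam):
    """CDF of the first nearest neighbor distance under CSR (Equation 5.6)."""
    return 1.0 - math.exp(-lam * math.pi * x * x) if x > 0 else 0.0


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h


lam = 0.001  # events per m^2, as in Figure 5.1
area_k1 = integrate(lambda x: nn_pdf(x, lam, 1), 0.0, 200.0)
area_k3 = integrate(lambda x: nn_pdf(x, lam, 3), 0.0, 200.0)
mean_k1 = integrate(lambda x: x * nn_pdf(x, lam, 1), 0.0, 200.0)
print(area_k1, area_k3)                           # both should be close to 1
print(mean_k1, 1.0 / (2.0 * math.sqrt(lam)))      # numerical and closed-form mean
```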
Figure 5.3 shows the actual PDF (blue) and the theoretical PDF (Equation (5.7), broken
red) for the first nearest neighbor distances of the CSR process shown in Figure 5.1.
As shown in Figure 5.3, there is good agreement between the actual and theoretical PDFs.
For an aggregated process, the actual PDF is shifted toward the left because the nearest
neighbor distances are smaller than those of a CSR point process such as the one shown
in Figure 5.1. Conversely, the distribution is shifted toward the right for a regular
point process.


Figure 5.3. Actual and Theoretical PDF for the First NN Distances for CSR Process


5.4.  SPATIAL  DISTRIBUTION  OF  MINES  AND  FALSE  ALARMS 

This section discusses the nearest neighbor distances for actual scattered and
patterned minefields and for false alarms. The corresponding theoretical PDFs are also
drawn, and the actual and theoretical PDFs are compared.

Figure 5.4 shows the distribution of mines and the corresponding nearest neighbor
distance PDFs for a scattered minefield at probabilities of detection of 100% and 50%.
Figure 5.4(a) shows the distribution of mines in a typical scattered minefield scenario,
with the density (0.0127 mines/m²) displayed in the title. Figure 5.4(b) compares the
actual and theoretical (based on the CSR assumption) PDFs of the nearest neighbor
distance (NN distance) for the same case. Figure 5.4(c) shows the same comparison when
only 50% of the mines are randomly selected. The actual and theoretical PDFs show good
agreement in both cases, which suggests that this scattered minefield can be modeled as a
CSR process and that random detection does not affect the spatial characteristics of the
nearest neighbor distances for the minefield.
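
The effect of random detection on a CSR pattern can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not the thesis code: it relies on the standard result that independently thinning a Poisson process with retention probability p_d yields a Poisson process with intensity p_d·λ, so the detected mines should again follow Equation (5.7) with the reduced density. The 300 m region size, the seed, and the helper names are hypothetical; the density 0.0127 mines/m² is the value quoted for Figure 5.4(a). Edge effects are ignored, which slightly inflates the simulated mean distances.

```python
import math
import random


def simulate_csr(intensity, side, rng):
    """CSR on a side x side square: Poisson count, then independent uniform locations."""
    mean, n, t = intensity * side * side, 0, rng.expovariate(1.0)
    while t <= mean:          # count unit-rate exponential arrivals in [0, mean]
        n += 1
        t += rng.expovariate(1.0)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def nn_distances(points):
    """Brute-force first nearest neighbor distance of every event (O(n^2))."""
    dists = []
    for i, (xi, yi) in enumerate(points):
        best = min((xi - xj) ** 2 + (yi - yj) ** 2
                   for j, (xj, yj) in enumerate(points) if j != i)
        dists.append(math.sqrt(best))
    return dists


def csr_mean_nn(intensity):
    """Mean of the density in Equation (5.7): E[D] = 1 / (2 sqrt(lambda))."""
    return 1.0 / (2.0 * math.sqrt(intensity))


rng = random.Random(7)                       # arbitrary seed (assumption)
lam, side, p_d = 0.0127, 300.0, 0.5          # side length is an assumed value
mines = simulate_csr(lam, side, rng)
detected = [m for m in mines if rng.random() < p_d]   # independent random detection

mean_all = sum(nn_distances(mines)) / len(mines)
mean_det = sum(nn_distances(detected)) / len(detected)
print(f"all mines: {mean_all:.2f} m (CSR theory {csr_mean_nn(lam):.2f} m)")
print(f"detected : {mean_det:.2f} m (CSR theory {csr_mean_nn(p_d * lam):.2f} m)")
```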


(c) Actual and Theoretical PDF for first nearest neighbor for PD = 0.5 ($\lambda$ = 0.0071541)

Figure 5.4. CSR Scattered Minefield and Corresponding NN Distribution
For a patterned minefield, the nearest neighbor distance is fairly constant, with a
small variation due to placement that can be modeled as a Gaussian distribution. The
distribution of nearest neighbor distances (x) can be defined as:

$$f_D(x) = \frac{1}{\sqrt{2\pi}\,\sigma_s} \exp\left(-\frac{(x - S)^2}{2\sigma_s^2}\right)$$  (5.9)

where S is the separation between mines in a row and $\sigma_s^2$ is the variance of
this separation.

For the random detection of mines with probability of detection $p_d$, the
distribution of nearest neighbor distances can be derived as:


( x-kS)2 

00  1  -f - — 

=  lal  (510) 

k=\  y  2  7T(JS 

Figure 5.5 shows the distribution of mines and the corresponding nearest neighbor
PDFs for a patterned minefield at PD of 100% and 50%. Figure 5.5(a) shows the
distribution of mines in a typical patterned minefield scenario. Figure 5.5(b) compares
the measured PDF, the theoretical PDF (based on Equation (5.10)), and the PDF based on
the CSR assumption for the nearest neighbor distance (NN distance) in the case of 100%
detection. Figure 5.5(c) shows the same comparison for a probability of mine detection
of 50%.

As shown in Figures 5.5(b) and 5.5(c), the actual PDF reflects a regular pattern: for
100% detection of mines, the nearest neighbor distances have a prominent peak at the
separation distance (4 m). Moreover, for random detection of 50% of the mines, the peaks
in the PDF occur at integer multiples of the separation between mines, as shown in
Figure 5.5(c).
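
The appearance of peaks at integer multiples of the separation can be illustrated with a one-row simulation. The sketch below is a simplified illustration, not the thesis simulation: it considers a single row only, uses the 4 m separation quoted above, and assumes a placement standard deviation of 0.1 m, a row of 200 mines, and an arbitrary seed. It tabulates each nearest neighbor distance by the nearest integer multiple of S.

```python
import random
from collections import Counter


def patterned_row(num_mines, spacing, sigma, rng):
    """Mine positions along one row: nominal spacing plus Gaussian placement error."""
    return [i * spacing + rng.gauss(0.0, sigma) for i in range(num_mines)]


def nn_distances_1d(positions):
    """Nearest neighbor distance of each mine within a single row."""
    pts = sorted(positions)
    if len(pts) < 2:
        return []
    dists = []
    for i, p in enumerate(pts):
        left = p - pts[i - 1] if i > 0 else float("inf")
        right = pts[i + 1] - p if i < len(pts) - 1 else float("inf")
        dists.append(min(left, right))
    return dists


rng = random.Random(3)                 # arbitrary seed (assumption)
S, sigma, p_d = 4.0, 0.1, 0.5          # sigma is an assumed placement error
row = patterned_row(200, S, sigma, rng)
detected = [x for x in row if rng.random() < p_d]   # random detection of mines

full_hist = Counter(round(d / S) for d in nn_distances_1d(row))
thin_hist = Counter(round(d / S) for d in nn_distances_1d(detected))
print("PD = 1.0:", dict(full_hist))                  # all distances near 1 x S
print("PD = 0.5:", dict(sorted(thin_hist.items())))  # distances near 1 x S, 2 x S, ...
```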

(a) Patterned Minefield (density, $\lambda$ = 0.0104)

(b) Actual and Theoretical PDF for first nearest neighbor distances

(c) Actual and Theoretical PDF for first nearest neighbor for PD = 0.5 ($\lambda$ = 0.00545)

Figure 5.5. Patterned Minefield and Corresponding NN Distribution


Figure 5.6 shows the locations of false alarms detected by the RX algorithm in one
background segment. As is evident from the figure, the false alarms are not distributed
randomly but form clusters (enclosed in broken yellow).

Figure 5.7 shows the distribution of NN distances based on 15 background
segments. The blue curve shows the distribution of NN distances for actual false alarms,
and the red curve shows the distribution for a Poisson CSR process with the same false
alarm rate, drawn using Equation (5.7). The false alarm density is equal to 0.01 FA/m².

Figure 5.6. Location of False Alarms in a Background Segment

NN distance PDF for i.i.d. data with mine density 0.01

As shown in Figure 5.7, unlike the NN distances for the scattered minefield, the NN
distances for the false alarms are not distributed according to a CSR process. The false
alarms are clustered, as shown in Figure 5.6. This can also be seen in Figure 5.7, where
the PDF of NN distances for actual false alarms is biased toward the left, indicating
more false alarms in a comparatively small vicinity, as in an aggregated process.
Thus it becomes necessary to study other spatial distributions to model the spatial
locations of the false alarms. This aspect is, however, not addressed in this thesis and
is left for future work.
6. MINEFIELD DETECTION AND ANALYTICAL MODELS


Once the mine level detection is complete, the next step is to detect the presence
of minefields using the information from the mine level results. Minefield detection
depends on the thresholded mine level detection statistics derived in the previous
section. The target locations identified after non-max suppression and mine level
thresholding for each FoR are used for detecting the minefield and eventually evaluating
the minefield level confidence metric. Various minefield level detection algorithms are
applied to the anomaly detections to derive a metric that provides a confidence level for
the presence or absence of minefields in that area. As discussed in Section 3, a
minefield can be either scattered or patterned depending on the position of mines in the
minefield and the tactical scenario. A number of minefield detection schemes are
available in the literature to detect the presence of patterned or scattered minefields.
Some of these algorithms are discussed in this section. This section also explains the
analytical and statistical models that are developed and used to estimate minefield
detection performance. Separate analytical models are developed for patterned and
scattered minefields for validation.

6.1.  PATTERNED  MINEFIELD 

This section discusses the detection algorithm and analytical model for patterned
minefields.

6.1.1. Detection Algorithm. Pattern linear is one of the algorithms used for
evaluating the minefield level confidence values for patterned minefields in which mines
are arranged in rows. This algorithm uses the Hough line detector [Carlson et al., 1994]
to detect mines arranged in a linear fashion. The Hough line detector is a powerful and
extensively used feature detection method well suited for the detection of lines in the
presence of noise. It is used to evaluate the maximum number of targets that fall on a
line. The Hough transform projects each point onto a higher dimensional parametric space
of lines given by:

$$P = x \cos\theta + y \sin\theta, \quad \theta \in (-\pi/2, \pi/2), \quad P \in (-L, L)$$  (6.1)
where P represents the length of the normal from the origin to the line and $\theta$ is
the orientation of this normal with respect to the X axis. Figure 6.1 shows an example of
the Hough transform. Figure 6.1(a) shows five points in the spatial domain, and Figure
6.1(b) shows the Hough parameter space corresponding to these five points on the line.


(a) Five data points in spatial domain and a line through them (parametric representation of line)
(b) Hough parameter space showing five sinusoids corresponding to 5 data points

Figure 6.1. Hough Transform Pair Representing the Data Points in Spatial
Domain and Corresponding Sinusoids in the Hough Parameter Space


It can be shown that Equation (6.1) is equivalent to

$$P = \sqrt{x^2 + y^2}\, \sin\left(\theta + \tan^{-1}\frac{x}{y}\right)$$  (6.2)


Thus, mapping into this Hough parameter space ($[P, \theta]$) results in a sinusoid
with an amplitude and phase dependent on the spatial coordinates (x, y) of the point;
i.e., each point in the (x, y) space corresponds to a single sinusoid in the Hough
parameter space. Therefore, the set $\{P, \theta\}$ that satisfies the above equation
represents all lines that can pass through the point (x, y). These values of P and
$\theta$ are then collected into a number of bins called accumulators, depending on the
number of sectors into which a segment area is divided. The number of points in each bin
represents the number of points that are collinear on the line at a distance P from the
origin and angle $\theta$. The accumulator bins can be further evaluated to identify
targets on parallel lines by summing the counts along the same $\theta$ index.
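
The accumulator procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the pattern linear implementation used in this work: the function name hough_max_collinear, the bin sizes (1 degree in angle, 0.5 in distance), and the example points are all assumptions. Each point votes for every discretized angle using Equation (6.1), and the largest accumulator bin gives the maximum number of targets falling on a line.

```python
import math
from collections import Counter


def hough_max_collinear(points, n_theta=180, rho_res=0.5, rho_max=100.0):
    """Vote in a (P, theta) accumulator; return the largest bin count and its line."""
    acc = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = -math.pi / 2 + i * math.pi / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)   # Equation (6.1)
            if abs(rho) < rho_max:
                rho_bin = int(math.floor((rho + rho_max) / rho_res))
                acc[(rho_bin, i)] += 1
    (rho_bin, i), votes = acc.most_common(1)[0]
    rho = (rho_bin + 0.5) * rho_res - rho_max        # bin center
    theta = -math.pi / 2 + i * math.pi / n_theta
    return votes, rho, theta


# Five targets on the line y = x + 1 plus two off-line detections (hypothetical data).
targets = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]
votes, rho, theta = hough_max_collinear(targets)
print(votes, round(rho, 2), round(math.degrees(theta), 1))
```

Because several neighboring angle bins can collect the same five votes, the reported angle is only accurate to within a few bins; a finer accumulator or a peak-centering step would be needed for a precise line estimate.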

Another algorithm that can be used is pattern regular. This algorithm is the
same as pattern linear except that it enforces regularity between the targets
falling on a line [Lake et al., 1997; Malloy, 2003]. Thus, two detections that are too
close together or too far apart are neglected.

6.1.2. Analytical Model. The analytical model for the pattern linear detection
algorithm is described next. The analytical approach is based on the theoretical
estimation of minefield detection performance under certain assumptions. Patterned
minefields have mines arranged in some specific pattern, often linear. These can be in
the form of a number of rows with some predefined spacing between the rows and between
individual mines in each row. Let R denote the number of rows of mines in the patterned
minefield, with M mines in each row. The detection of mines is assumed to be binomial, so
that if $p_d$ is the probability of detection of a mine, then the probability of
detecting n mines in a row is given by

$$P[n \text{ mines} \mid M] = \frac{M!}{n!\,(M-n)!}\, p_d^{\,n} (1 - p_d)^{M-n}$$  (6.3)


False alarm detections are assumed to be Poisson with density $\rho_B$. Thus, when
the minefield is not present, the probability of k detections in an interrogation area of
size $A_s$ is given by

$$P[k \mid \text{Background only}] = \frac{(\rho_B A_s)^k}{k!}\, e^{-\rho_B A_s}$$  (6.4)


The size of the interrogation area $A_s$ depends on the linearity of the minefield. In
a linear minefield, this area can be a thin strip. When a minefield is present, the total
number of detections is given by the sum of mine detections and false alarms. Thus the
probability of detecting k targets in area $A_s$ when a minefield is present is given by:

$$P[k \mid \text{Minefield}] = \sum_{\lambda=0}^{k} P(\lambda \text{ mines} \mid M) \cdot P(k - \lambda \mid \text{Background only})$$  (6.5)

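
The three count distributions above (binomial mine detections, Poisson false alarms, and their convolution) can be evaluated directly. The following Python sketch is illustrative only; the function names and the example values (12 mines per row, a 100 m² strip, p_d = 0.34, and a false alarm density of 0.001/m²) are assumptions chosen for the demonstration rather than parameters taken from the analysis.

```python
import math


def binomial_pmf(n, m, p_d):
    """Equation (6.3): probability of detecting n of the m mines in a row."""
    if n < 0 or n > m:
        return 0.0
    return math.comb(m, n) * p_d ** n * (1.0 - p_d) ** (m - n)


def poisson_pmf(k, mean):
    """Equation (6.4): probability of k false alarms when the expected count is mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k) if k >= 0 else 0.0


def strip_pmf(k, m, p_d, rho_b, a_s):
    """Equation (6.5): k total detections in a strip that contains one row of mines."""
    return sum(binomial_pmf(j, m, p_d) * poisson_pmf(k - j, rho_b * a_s)
               for j in range(k + 1))


# Hypothetical example values (assumptions, see the text above).
m, p_d, rho_b, a_s = 12, 0.34, 0.001, 100.0
pmf = [strip_pmf(k, m, p_d, rho_b, a_s) for k in range(60)]
mean_detections = sum(k * p for k, p in enumerate(pmf))
print(round(sum(pmf), 6), round(mean_detections, 3))   # total probability, mean count
```

The mean of the convolved distribution equals the expected number of mine detections plus the expected number of false alarms, M·p_d + ρ_B·A_s, which provides a quick consistency check.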


In a FoR with area A, there are $\theta = \mathrm{round}[A / A_s]$ independent
measurements from an interrogation area of size $A_s$. Similarly, for R rows of mines in
the FoR, there are R independent measurements for mine statistics. The final minefield
confidence can be taken as the maximum of these independent measurements in the FoR. The
probability of detection $P_d(k)$ and the probability of false alarm $P_{fa}(k)$ at any
threshold k in this case are given by

$$P_d(k) = 1 - \left(1 - \sum_{\lambda=k}^{\infty} P(\lambda \mid \text{Minefield})\right)^{R}$$  (6.6)

$$P_{fa}(k) = 1 - \left(1 - \sum_{\lambda=k}^{\infty} P(\lambda \mid \text{Background only})\right)^{\theta}$$  (6.7)

The false alarm rate at threshold k can be written as

$$FAR(k) = P_{fa}(k)\, \frac{10^6}{A} \ \text{FA/km}^2$$  (6.8)

$P_d(k)$ and $FAR(k)$ in Equations (6.6) and (6.8) can be used to draw the ROC curve for
analytical performance.
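
An analytical ROC curve of this kind can be tabulated with a short script. The sketch below is a hedged illustration rather than the code used to produce the figures: it assumes that a minefield is declared when the maximum strip count is at least k (the tail-probability convention of Equation (6.18)), that the R row strips and the θ background strips are independent, and that A is expressed in m². All parameter values and function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of k false alarms for a given expected count."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def binomial_pmf(n, m, p):
    """Probability of detecting n of m mines."""
    return math.comb(m, n) * p ** n * (1.0 - p) ** (m - n) if 0 <= n <= m else 0.0


def tail(pmf, k):
    """P(count >= k) = 1 - sum of pmf(j) for j < k."""
    return max(0.0, 1.0 - sum(pmf(j) for j in range(k)))


def pattern_linear_roc(m, rows, p_d, rho_b, a_s, area, thresholds):
    """Return (k, Pd, FAR in FA/km^2) for each threshold k."""
    n_strips = round(area / a_s)                     # theta in the text

    def background(k):                               # strip without mines
        return poisson_pmf(k, rho_b * a_s)

    def minefield(k):                                # strip containing one row
        return sum(binomial_pmf(j, m, p_d) * background(k - j)
                   for j in range(min(k, m) + 1))

    roc = []
    for k in thresholds:
        pd = 1.0 - (1.0 - tail(minefield, k)) ** rows        # max over R rows
        pfa = 1.0 - (1.0 - tail(background, k)) ** n_strips  # max over theta strips
        roc.append((k, pd, pfa * 1e6 / area))                # area in m^2
    return roc


# Hypothetical parameters: 12 mines per row, 3 rows, 100 m^2 strips, 10,000 m^2 FoR.
roc = pattern_linear_roc(m=12, rows=3, p_d=0.34, rho_b=0.001,
                         a_s=100.0, area=10000.0, thresholds=range(0, 13))
for k, pd, far in roc:
    print(k, round(pd, 4), round(far, 4))
```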

Figure 6.2 shows analytical ROC curves for a patterned minefield with three rows
of mines and different distances between mines. Distances of 4 m, 5 m, and 6 m between
mines are used. For a minefield FAR of 0.5 FA/km², the corresponding PD values are 0.95,
0.72, and 0.57 for separations of 4 m, 5 m, and 6 m, respectively. As can be seen from
the figure, minefield performance improves as the separation between mines decreases.
This is expected since the number of mines in the minefield increases as the separation
between mines is reduced.


Figure 6.2. Analytical ROC Curve for Patterned Minefield with Different Distance
between Mines


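Alternatively, R rows of detections (mines) can be scored together, in which case the
probabilities of the detection of mines and of false alarms are given by

$$P_R[n \text{ mines} \mid \text{Minefield}] = \binom{RM}{n}\, p_d^{\,n} (1 - p_d)^{RM - n}$$  (6.9)

$$P_R[k \mid \text{Background only}] = \frac{(\rho_B R A_s)^k}{k!}\, e^{-\rho_B R A_s}$$  (6.10)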
The probability of detection $P_d(k)$ and the probability of false alarm $P_{fa}(k)$ at
any threshold k are given by

$$P_d(k) = \sum_{\lambda=k}^{\infty} P_R(\lambda \mid \text{Minefield})$$  (6.11)

$$P_{fa}(k) = 1 - \left(1 - \sum_{\lambda=k}^{\infty} P_R(\lambda \mid \text{Background only})\right)^{(\theta/R)}$$  (6.12)

$FAR(k)$ can be obtained as defined in Equation (6.8), and $P_d(k)$ and $FAR(k)$ can be
used to draw the analytical ROC curves.

Figure 6.3 shows analytical ROC curves for a patterned minefield with different
numbers of rows in the minefield. The distance between the mines is kept constant at
5 m. For a minefield FAR of 0.5 FA/km², corresponding PD values of 0.35, 0.85, and
0.98 are obtained for 1, 2, and 3 rows of mines, respectively. The minefield performance
improves significantly as the number of rows is increased. This is expected since a
larger number of rows in a minefield increases the number of mines available for
minefield detection.

6.2.  SCATTERED  MINEFIELD 

Various minefield level detection algorithms and the analytical model for scattered
minefields are discussed below.

6.2.1. Detection Algorithm. Scatter number is the basic algorithm for evaluating
the minefield confidence value for each FoR for a scattered minefield. In this algorithm,
the number of detections from the interrogation patch is taken as the confidence value
for the minefield. Thus, the confidence value $\Lambda_{SN}$ for a FoR is given as

$$\Lambda_{SN} = N\{D\}$$  (6.13)

where $N\{D\}$ represents the number of detections obtained from the FoR. The statistic
is applicable to the CFAR type of thresholding scheme for target detection.

Figure 6.3. Analytical ROC Curve for Patterned Minefield with Different Number of
Rows of Mine


The test statistic in Equation (6.13) can be shown to be optimal when both the
mines and the false alarms follow a CSR distribution and no other information is
available to differentiate mines from false alarms [Earp, 2000b]. This detection scheme
is popular because it allows quick analysis of scattered minefields.

Scattered log weighted is another algorithm, similar to scatter number except that the
decision statistic is defined as the average of the log of the detection statistics for
all valid detections, as proposed by Earp [Earp et al., 1995]. A minefield test statistic
based on the likelihood ratio is defined as:

$$\Lambda_{minefield} = \frac{1}{N} \sum_{k=1}^{N} Ll(x_k)$$  (6.14)

where $Ll(x)$ is the log-likelihood ratio for the AD value x, and N is the number of
targets in the FoR.
Using empirical results, Earp [Earp, 1995] proposed a log function in place of
$Ll(x)$, so that the test statistic is defined as:

$$\Lambda_{minefield} = \frac{1}{N} \sum_{k=1}^{N} \log(x_k)$$  (6.15)

This statistic is applicable to constant target rate (CTR) as well as to constant
false alarm rate (CFAR) type target detection. This value represents the minefield
confidence value for that segment.
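
Both scattered minefield statistics are simple to compute from the list of anomaly detector (AD) values of the detections in a FoR. The following Python sketch is illustrative: the function names are hypothetical, the AD values are assumed to be positive, and returning negative infinity for an empty FoR is an assumption made here to keep the function total, not a rule taken from the cited work.

```python
import math


def scatter_number(detections):
    """Scatter number: the confidence is the number of detections in the FoR."""
    return len(detections)


def scatter_log_weighted(detections):
    """Scattered log weighted: average log of the AD values of the detections."""
    if not detections:
        return float("-inf")   # assumption: an empty FoR gets the lowest confidence
    return sum(math.log(x) for x in detections) / len(detections)


ad_values = [12.5, 30.1, 18.7, 45.0]    # hypothetical AD values for one FoR
print(scatter_number(ad_values))
print(round(scatter_log_weighted(ad_values), 3))
```

The scatter number ignores how strong each detection is, whereas the log weighted statistic rewards segments whose detections have large AD values.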

6.2.2. Analytical Model. For a scattered minefield, it is assumed that the mines
are laid randomly over an area. Because the deployment of mines within the minefield is
random, a suitable model for the distribution is a “completely random” Poisson point
process [Earp, 2000a and 2000b]. In this case, the probability of encountering a mine in
an area is independent of the probability of encountering a mine in any other area. The
false detections are also assumed to be Poisson, with a different intensity parameter
$\rho_B$. The probability of getting k false alarm detections from the background in an
area A of the FoR can be modeled as

$$P[k \mid \text{Background only}] = \frac{(\rho_B A)^k}{k!}\, e^{-\rho_B A}$$  (6.16)


Let $\rho_T$ be the density of mines (in mines per unit area) in the minefield and
$p_d$ be the rate of detection. Then the probability of getting k detections in an area A
is given by

$$P[k \mid \text{MineField}] = \frac{(p_d \rho_T A + \rho_B A)^k}{k!}\, e^{-(p_d \rho_T A + \rho_B A)}$$  (6.17)

representing Poisson detection of both the mines and the false alarms.
The probability of detection $P_d(k)$, probability of false alarm $P_{fa}(k)$, and FAR
at any threshold k are given by

$$P_d(k) = \sum_{y=k}^{\infty} P[y \mid \text{MineField}] = 1 - e^{-(p_d \rho_T A + \rho_B A)} \sum_{y=0}^{k-1} \frac{(p_d \rho_T A + \rho_B A)^y}{y!}$$  (6.18)

$$P_{fa}(k) = \sum_{y=k}^{\infty} P[y \mid \text{Background only}] = 1 - e^{-\rho_B A} \sum_{y=0}^{k-1} \frac{(\rho_B A)^y}{y!}$$  (6.19)

$$FAR(k) = P_{fa}(k)\, \frac{10^6}{A} \ \text{FA/km}^2$$  (6.20)

$P_d(k)$ and $FAR(k)$ can be used to draw the analytical ROC curves. Note that an
implicit assumption in the model is that the FoR is smaller than the minefield and
lies fully over the minefield. Figure 6.4 shows analytical ROC curves for a
scattered minefield with different mine densities.
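
The scattered minefield model reduces to two Poisson tail probabilities, so its ROC curve is easy to tabulate. The sketch below is a minimal illustration, assuming A is given in m² so that the factor 10^6/A converts to FA/km². The mine level PD of 0.34, the false alarm density of 0.001 FA/m², and the three mine densities are the values quoted for Figure 6.4; the FoR area of 7000 m² and the function names are assumptions, so the printed numbers are not expected to reproduce the figure.

```python
import math


def poisson_tail(k, mean):
    """P(count >= k) for a Poisson count with the given mean."""
    cdf = sum(math.exp(-mean) * mean ** y / math.factorial(y) for y in range(k))
    return max(0.0, 1.0 - cdf)


def scatter_number_roc(p_d, rho_t, rho_b, area, thresholds):
    """Return (k, Pd, FAR in FA/km^2) for each threshold k; area is in m^2."""
    rows = []
    for k in thresholds:
        pd = poisson_tail(k, (p_d * rho_t + rho_b) * area)   # minefield present
        pfa = poisson_tail(k, rho_b * area)                  # background only
        rows.append((k, pd, pfa * 1e6 / area))
    return rows


area = 7000.0   # assumed FoR area in m^2 (not taken from the text)
roc = {rho_t: scatter_number_roc(0.34, rho_t, 0.001, area, range(0, 40))
       for rho_t in (0.002, 0.004, 0.006)}
for rho_t, rows in roc.items():
    k, pd, far = rows[15]
    print(f"rho_T = {rho_t}: k = {k}, Pd = {pd:.3f}, FAR = {far:.3f} FA/km^2")
```

At a fixed threshold the false alarm rate is the same for all three densities, while the detection probability grows with mine density, which is the trend reported for Figure 6.4.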


Figure 6.4. Analytical ROC Curve for Scattered Minefield with Different Mine Density

The mine level PD at the FAR of 0.001 FA/m² is 0.34. Mine densities of 0.002
mines/m², 0.004 mines/m², and 0.006 mines/m² are used. Minefield level PD values of
0.145, 0.567, and 0.882 are obtained, respectively, at a false alarm rate of
0.5 FA/km². PD and FAR are calculated using Equations (6.18) and (6.20), respectively.
The minefield performance improves as the mine density increases.
7. SIMULATION RESULTS


This  section  presents  mine  and  minefield  detection  results  for  the  simulated 
parameters.  Several  variables  are  used  for  generating  the  simulated  data,  which  are 
tabulated  in  Table  7.1.  Results  are  generated  for  both  patterned  and  scattered  minefields 
with  different  parameters.  Parameters  whose  effects  are  considered  include  signal  to 
clutter  ratio  (SCR),  constant  false  alarm  rate  (CFAR),  swath  width,  holidays,  and  segment 
overlap.  Each  of  these  parameters  is  discussed  in  the  following  subsections. 


Table 7.1. Simulation Parameters for Scattered and Patterned Minefield Data

Parameter                   Scattered                        Patterned                          Unit
No. of runs simulated       750                              750
Flight speed                75                               75                                 knot
Altitude                    2050                             2050                               feet
No. of rows, columns        512, 640                         512, 640                           pixel
No. of swath, steps         7, variable (1-5)                7, variable (1-5)
Swath width                 60-290                           60-290                             feet
FoR length                  451-455                          451-455                            feet
Bkg. anomaly distribution   Central F-distribution           Central F-distribution
                            (J = 4, Nc = 50)                 (J = 4, Nc = 50)
Bkg. spatial distribution   Poisson with density of 0.01/m²  Poisson with density of 0.01/m²
Mine anomaly distribution   Non-central F-distribution       Non-central F-distribution
                            (J = 4, Nc = 50, SCR = 0.5)      (J = 4, Nc = 50, SCR = 0.125)
Mine spatial distribution   Poisson with density of          Linear (distance of 9 meters
                            0.004/m²                         between mines. Three rows with
                                                             distance of 20 m between rows)
Mine type                   SM A (Small metal)               LM A B (Large metal Buried)
Mine size (No. of pixels)   6 (13)                           12 (50)                            inch
Mine level algorithm        CFAR with FAR of 0.001           CFAR with FAR of 0.001             FA/m²
Minefield algorithm         Scatter Number                   Pattern Linear

7.1. EFFECT OF SCR

The  SCR  for  the  RX  detection  is  defined  as  in  Equation  (3.6) 

SCR=£T  =  fltf /jo2  (7.1) 

where $\mu$ is the mean target signature and $\sigma$ is the standard deviation of the
background area. Figure 7.1 shows the mine level and corresponding minefield level ROC
curves for different SCR values. Figure 7.1(a) shows the mine level ROC curves for a
scatterable minefield, and Figure 7.1(b) shows the corresponding minefield level ROC
curves, for SCR values of 0.4 (red), 0.5 (blue), and 0.6 (green) with three steps (swath
width of 175 feet). The corresponding GSNR values are 20.8, 26, and 31.2, respectively.


Figure 7.1. Mine Level ROC Curve and Corresponding Minefield Level ROC Curve for Different SCR Values: (a) mine level ROC curve (horizontal axis: False Alarm Rate, FA/m2); (b) minefield level ROC curve (horizontal axis: False Alarm Rate, FA/km2)


As shown in Figure 7.1, both the mine level and the minefield level performance improve as the SCR value increases. This is expected because a higher SCR means a larger contrast between the mine targets and the background clutter, which facilitates the detection of mines. The same behavior is expected for the patterned minefield. The mine level PDs for the three SCR values at an FAR of 0.001 FA/m2 are 0.19, 0.34, and 0.49, respectively, as indicated by the 'x' marks in Figure 7.1(a). Similarly, the minefield PDs at an FAR of 0.5 FA/km2 are 0.30, 0.68, and 0.93, respectively.

7.2. EFFECT OF CFAR VALUE

In this type of mine level thresholding, the false alarm rate per square meter ($p_B$) is specified. At any selected value of $p_B$, the expected number of false alarms in an FoR is constant. Figure 7.2 shows the effect of the CFAR value on the minefield level performance. CFAR values of 0.00005 (red), 0.001 (blue), 0.004 (magenta), and 0.008 (black) are used with scattered minefields with three steps. As shown in Figure 7.2(b), as the FAR value increases, the minefield performance decreases. This is because the density of mines is small compared with the density of background anomalies (0.004/m2 for mines versus 0.01/m2 for background anomalies). As the CFAR value increases, the number of detected mines in the FoR also increases because the PD increases. However, this increase is smaller than the increase in the number of false alarms.
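
The link between the CFAR value and the expected false alarm count per FoR can be illustrated with a short calculation. The sketch below is only indicative: it assumes a rectangular FoR of 453 ft by 175 ft (the middle of the FoR length range in Table 7.1 and the three-step swath width), and the function names are hypothetical rather than part of the simulation tool.

```python
FEET_TO_METERS = 0.3048


def for_area_m2(length_ft, width_ft):
    """Area of a rectangular field of regard (FoR) in square meters."""
    return (length_ft * FEET_TO_METERS) * (width_ft * FEET_TO_METERS)


def expected_false_alarms(p_b, area_m2):
    """Expected false alarm count for a CFAR value p_b given in FA/m^2."""
    return p_b * area_m2


if __name__ == "__main__":
    # Assumed FoR geometry: 453 ft long, 175 ft wide (three steps).
    area = for_area_m2(453.0, 175.0)
    for p_b in (0.00005, 0.001, 0.004, 0.008):
        count = expected_false_alarms(p_b, area)
        print(f"p_B = {p_b:g} FA/m^2 -> {count:.2f} expected false alarms")
```

Under these assumptions the FoR covers about 7365 m2, so a CFAR value of 0.001 FA/m2 corresponds to roughly seven expected false alarms per FoR.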

Figure 7.2. Mine Level ROC Curve and Corresponding Minefield Level ROC Curve for Different FAR Values for Scatterable Minefield: (a) mine level ROC curve (horizontal axis: False Alarm Rate, FA/m2); (b) minefield level ROC curve (horizontal axis: False Alarm Rate, FA/km2)

7.3. EFFECT OF SWATH WIDTH

Figures 7.3 and 7.4 show the simulated layouts for swath widths of one step and five steps for patterned and scattered minefields, respectively. Magenta squares and red diamonds represent the mines in the respective figures. Each segment is shown with a different color fill.


Figure 7.3. Simulated Minefield Layout for a Patterned Minefield with Swath Width of One Step (axis label: X (m))


Figure 7.5 shows the effect of the swath width on the minefield level performance for a patterned minefield with three rows of mines and 20% side-step overlap, and Figure 7.6 shows the effect on a scattered minefield. The results use 1 (red), 2 (blue), 3 (green), 4 (magenta), and 5 (black) steps, corresponding to swath widths of 63 ft, 119 ft, 175 ft, 232 ft, and 288 ft, respectively. The mine level FAR is chosen to be 0.001 FA/m2 for both the scattered and patterned minefields, resulting in a corresponding PD of 0.33 for the scattered minefield and 0.51 for the patterned minefield.


Figure 7.4. Simulated Minefield Layout for a Scattered Minefield with Swath Width of Five Steps (axis label: Y (m))

Figure 7.5. Simulated and Analytical Minefield Level ROC Curves for the Patterned Minefield for Different Swath Widths: (a) simulated minefield performance; (b) analytical minefield performance (axes: False Alarm Rate, FA/km2, and Probability of Detection)


As shown in Figures 7.5 and 7.6, the simulated and analytical results agree well. Also, for both the scattered and patterned minefields, performance improves with increasing swath width. This is expected because the minefield front is wider than the maximum swath width of 288 feet (88 meters). A larger swath width places more mine targets in the FoR, which results in more reliable detections. However, increasing the swath width beyond the size of the minefield front may actually result in fewer detections. This may happen for smaller tactical scattered minefields.


Figure 7.6. Mine Level ROC Curve and Corresponding Minefield Level ROC Curve for Scattered Minefields with Different Swath Widths: (a) simulated minefield performance; (b) analytical minefield performance, with one curve for each of 1 to 5 steps (horizontal axis: False Alarm Rate, FA/km2)


7.4.  EFFECT  OF  HOLIDAYS 

Data are collected in the form of image frames. These frames are then stitched together to form an FoR/segment. This transformation of image frames into a single coordinate system is called image registration, and it plays a significant role in the mine and minefield level performance evaluation. Accurate image registration requires some portion of consecutive frames to overlap so as to provide the control points for the frame-to-frame registration. Sufficient overlap should be present in both the in-flight and across-flight directions to obtain an accurate and undistorted mosaic of frames. If the overlap among frames is insufficient, then holidays (gaps in coverage) will exist between the frames. Figures 7.7 and 7.8 show the simulation results for side-step overlaps of -20% (holiday) and 20%, respectively. In Figure 7.7, the negative side-step overlap is clearly visible and results in the absence of control points in the in-flight direction.

Figure 7.7. Simulated Minefield Layout for a Patterned Minefield with -20% (Holiday) Side-Step Overlap

Figure 7.8. Simulated Minefield Layout for a Scattered Minefield with 20% Side-Step Overlap


Although the presence of holidays allows more area to be covered in each run, the absence of control points results in poor alignment of detections and eventually in poor minefield performance. This is especially true for patterned minefields because their detection relies on a linear pattern. Figure 7.9 shows the effect of side-step overlap on the minefield level performance for a patterned minefield with three rows of mines and for scattered minefields. Figure 7.10 shows the corresponding analytical results. Side-step overlaps of -20% (holiday, red), 0% (blue), and 20% (green) are used for the minefield level detection with no registration error. As shown in Figures 7.9 and 7.10, when no registration error is present, the minefield performance improves as the side-step overlap decreases. This is expected because a decrease in the overlap increases the effective swath width of the FoR, which admits more targets into the performance evaluation for the given mine level FAR. With more allowable targets per segment, the minefield confidence value for the segment increases, resulting in better minefield performance.


Figure 7.9. Simulated Minefield Level ROC Curves for Patterned and Scattered Minefields with Different Side-Step Overlap


Figure 7.11 shows the minefield performance when a registration error of 5 m is present. In this case, the minefield performance for the negative overlap case is in fact poorer than for positive overlap. This is because, with negative overlap (holiday) or no overlap, the image frames are not stitched in any trustworthy fashion along the steps. Pattern detection therefore suffers because rows or targets that actually lie in a linear pattern no longer appear to do so. The effect on scattered minefield detection is less pronounced because the detection statistic is based on the count of detections and not on their spatial distribution.

Figure 7.10. Analytical Minefield Level ROC Curves for the Patterned and Scattered Minefields for Different Side-Step Overlaps: (a) patterned minefield level ROC curve; (b) scattered minefield level ROC curve

Figure 7.11. Simulated Minefield ROC Curves for Patterned and Scattered Minefields for Different Side-Step Overlaps with a Registration Error of Five Meters

7.5. EFFECT OF SEGMENT OVERLAP

Once the intra-segment (frame-to-frame) registration is complete, the inter-segment (segment-to-segment) registration must be done to reconstruct the complete run. Just like frame-to-frame registration, segment-to-segment registration depends on the amount of overlap available between consecutive segments. A segment overlap of zero implies that each FoR is disjoint from the others, and an overlap of six implies that every newly captured swath and the previous six swaths are used for minefield evaluation.
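
The grouping of swaths into segments for a given overlap can be sketched as a sliding window. The sketch below assumes seven swaths per segment (as in Table 7.1) and an overlap counted in swaths; the function name `segment_windows` and its parameters are hypothetical and not taken from the simulation tool.

```python
def segment_windows(num_swaths, swaths_per_segment=7, overlap=0):
    """Return one list of swath indices per segment (FoR).

    Consecutive segments share `overlap` swaths, so the window advances
    by (swaths_per_segment - overlap) swaths each time.
    """
    if not 0 <= overlap < swaths_per_segment:
        raise ValueError("overlap must satisfy 0 <= overlap < swaths_per_segment")
    step = swaths_per_segment - overlap
    windows = []
    start = 0
    while start + swaths_per_segment <= num_swaths:
        windows.append(list(range(start, start + swaths_per_segment)))
        start += step
    return windows


if __name__ == "__main__":
    # Overlap 0: disjoint FoRs. Overlap 6: each new swath plus the previous six.
    print(segment_windows(14, overlap=0))
    print(segment_windows(14, overlap=6)[:2])
```

With an overlap of six, a minefield that straddles the boundary of two disjoint FoRs is fully contained in at least one of the sliding windows, provided it fits within a single segment.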

Examples of a segment overlap of two for a scattered minefield and a segment overlap of four for a patterned minefield are shown in Figures 7.12 and 7.13, respectively. Figure 7.14 shows the effect of segment overlap on the minefield level performance for a patterned minefield with three rows of mines and for a scattered minefield. Segment overlaps of zero (red) and six (blue) segments are used for the minefield level ROC curves.

As shown in Figure 7.14, the minefield performance improves slightly for overlapping FoRs. This is because, in the no-overlap case, a minefield may lie partially in each of two consecutive FoRs, which implies poor detection in both FoRs and hence poor performance.


Figure 7.12. Simulated Minefield Layout for a Scattered Minefield with a Segment Overlap of Two Segments

Figure 7.13. Simulated Minefield Layout for a Patterned Minefield with a Segment Overlap of Four Segments

Figure 7.14. Simulated Minefield Level ROC Curves for Patterned and Scattered Minefields for Different Segment Overlaps
8. CONCLUSION AND FUTURE WORK


A typical airborne minefield detection system is modeled, and mine and minefield level performance is evaluated based on simulated data under different data collection scenarios. The simulation system created to synthesize representative data under different scenarios of interest is discussed. The parameters that drive the system's performance are identified, and the simulated results are evaluated and discussed. The effects of the different schemes used for thresholding the anomaly statistics of the airborne data are discussed. CFAR seems to be the most effective technique for threshold selection because the threshold is selected adaptively depending on the background statistics. A fixed threshold and a constant target rate can both perform poorly when the background is non-homogeneous. Non-homogeneity of the background, however, has not been modeled in the simulation tool.

The central F distribution is successfully used for modeling the RX detection statistics. The parameters of the modified central F distribution are obtained using the EM algorithm, and the results for modeling different combinations of MSI bands and different target radii are shown. Eight different data sets are created according to the type of background, the time, and whether the data contain background only or background with a minefield; the modeling performs quite well on these data sets. The modeling results are especially good for target radii of 1 and 2. The fit is best when the background is homogeneous or approximately homogeneous over a complete FoR, whereas for a mixture of backgrounds the fit is generally poor, indicating a likely multimodal distribution. Modeling performance and fit are shown using both the PDF and the inverse CDF.

Various spatial distribution techniques used for modeling the spatial locations of the false alarms are discussed. However, obtaining a satisfactory match between the actual and simulated spatial locations of false alarms requires further work because the actual locations are clustered and do not follow a Poisson distribution. Analytical models for both scattered and patterned minefields are derived and implemented in the simulation system. They show good agreement with the proposed minefield detection algorithms. Simulation results for a range of different parameters and their effects are discussed in Section 7. The simulated and analytical results are found to be in good agreement.

The present work lays the foundation for a simulation-based system capable of evaluating the performance of airborne mine and minefield detection structures. This research can be extended in a number of directions. IR data can be modeled in a manner similar to the MSI data using the EM algorithm. Other anomaly detection techniques, especially FAM techniques, can also be modeled. It would also be useful to model the spatial distribution of the actual false alarm locations and then incorporate that model into the simulation system. This would complete a modeling tool that reproduces not only anomaly values but also spatial locations in accordance with actual data. The modeling tool can also be extended to include multiple minefields per run. Finally, the performance obtained on the simulated data should be compared with that obtained on actual airborne collected data.


APPENDIX  A. 

SPECTRAL  VEGETATION  INDICES 
This section explains the indices that are used to classify the frames as vegetation or non-vegetation. Several Spectral Vegetation Indices (SVIs) that provide a measure of live, green vegetation in an area are available in the literature. These indices are designed to enhance the vegetation signal in remotely sensed data. Most of them combine data from various spectral bands (red, blue, green, and near infrared) into a single value. The idea of SVIs is based on the fact that the spectral response of green leaves exhibits a jump in reflectance in the near infrared (NIR) portion of the spectrum (700 - 1350 nm) due to the dominant plant pigment, chlorophyll. This response is often called the IR rise [Emch, 2001]. Figure A.1 shows the spectral reflectance curves for dirt (brown), rock (black), and vegetation (green). The IR rise for green vegetation is clearly visible in the figure. As seen in the figure, plants also exhibit pronounced absorbance at the bluish (400 - 500 nm) and reddish (600 - 700 nm) wavelengths and thus appear green. This absorption in the red or blue bands, together with the IR rise, can be exploited to provide a unique index for the vegetation cover. Different SVIs are discussed in the following subsections.


Figure A.1. Spectral Reflectance Curves for Different Land Forms (plot title: Spectral Reflectance Curves for Dirt, Rock, and Vegetation)
A.1. Simple Ratio or Ratio Vegetation Index (SR or RVI)

One of the simplest SVI metrics is the ratio between the NIR and red bands. Thus,

SR or RVI = NIR/RED    (A.1)

SR values for bare soil are generally near 1 because the red and NIR bands have similar reflectance over bare soil. For live vegetation, the SR increases. The SR values are unbounded and can range from 0 to infinity. Figure A.2 shows an example of the SR for a dense vegetation frame. The red and NIR bands appear in the top part of the figure. The PDF of the SR values is also shown in Figure A.2. On the SR image, red dots indicate the pixels having an SR value greater than 3 (representing a high likelihood of vegetation). The figure shows that the SR is quite effective in detecting the presence of live vegetation: the red points fall approximately on top of the vegetation areas. The SR values are not estimated for a 20-pixel-wide border at the edges of the frame.
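
The SR thresholding described above can be sketched in a few lines. This is a minimal illustration on a tiny 2 x 2 frame with made-up reflectance values; the function names are hypothetical, and the handling of zero red reflectance is an assumption that the text does not address.

```python
def simple_ratio(nir, red):
    """Pixel-wise SR = NIR / RED for two equally sized 2-D lists.

    Pixels with zero red reflectance are assigned infinity (an assumption).
    """
    return [
        [(n / r) if r > 0 else float("inf") for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]


def vegetation_mask(index_image, threshold):
    """Flag pixels whose index value exceeds the threshold."""
    return [[value > threshold for value in row] for row in index_image]


if __name__ == "__main__":
    # Hypothetical reflectances: left column soil-like, right column vegetation-like.
    nir = [[0.30, 0.60], [0.25, 0.50]]
    red = [[0.28, 0.10], [0.25, 0.05]]
    sr = simple_ratio(nir, red)
    print(sr)
    print(vegetation_mask(sr, threshold=3.0))
```

The soil-like pixels give SR values near 1 and fall below the threshold of 3, whereas the vegetation-like pixels exceed it.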


Figure A.2. Red Band, NIR Band, Simple Ratio (SR) Image and Histogram for the Vegetation Frame (SR Image - Frame 2). The Red Dots Indicate the Pixels Having an SR Value Greater than 3
A.2. Normalized Difference Vegetation Index (NDVI)

NDVI is defined as:

NDVI = (NIR - RED)/(NIR + RED)    (A.2)

As shown in Equation (A.2), the difference between the NIR and RED bands is divided by their sum. This normalization is used to minimize the effect of variable irradiance levels. NDVI is always bounded between -1 and 1. A higher positive value of NDVI indicates the presence of green vegetation, whereas a value close to 0 indicates a non-vegetation background or dead vegetation. Free-standing water (e.g., oceans, seas, lakes, and rivers), which has low reflectance in both the NIR and visible bands, results in very low positive or slightly negative NDVI values, whereas clouds and snowfields exhibit negative values for this index.

Figure A.3 shows the individual red and NIR bands along with the NDVI image and the PDF of the NDVI values for a dirt frame. A threshold of 0.4 is used to separate live vegetation from other areas. As shown in Figure A.3, the histogram for the dirt-only frame is symmetric and peaks at 0.26. Moreover, for the dirt frame, no pixel has an NDVI value greater than 0.4. Figure A.4 shows the individual red and NIR bands along with the NDVI image and the PDF of the NDVI values for a vegetation frame. In this case, for a threshold of 0.4, most of the vegetated region is correctly identified as vegetation. This indicates that NDVI can act as a good classifier to differentiate between vegetation and non-vegetation features.
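
As a small worked example of Equation (A.2) and the 0.4 threshold, the sketch below evaluates NDVI for two single pixels. The reflectance values are hypothetical and chosen only so that the dirt-like pixel reproduces an NDVI of 0.26; returning 0 when both bands are zero is likewise an assumption.

```python
def ndvi(nir, red):
    """NDVI = (NIR - RED) / (NIR + RED); returns 0.0 if both bands are zero."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def is_vegetation(nir, red, threshold=0.4):
    """Classify a pixel as vegetation when its NDVI exceeds the threshold."""
    return ndvi(nir, red) > threshold


if __name__ == "__main__":
    dirt_pixel = (0.315, 0.185)        # hypothetical dirt-like reflectances
    vegetation_pixel = (0.50, 0.08)    # hypothetical vegetation-like reflectances
    for name, (nir, red) in (("dirt", dirt_pixel), ("vegetation", vegetation_pixel)):
        print(name, round(ndvi(nir, red), 3), is_vegetation(nir, red))
```

For non-negative reflectances the result always lies between -1 and 1, in line with the bound stated for NDVI.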

In Figure A.4, some regions in shadow are also detected as vegetation along with the actual vegetation. This suggests that although NDVI is a good classifier for vegetation, some false alarms can be produced by shadows or by misalignment of the bands.

Figure A.3. Red Band, NIR Band, NDVI Image and Histogram for the Dirt Only Frame

Figure A.4. Red Band, NIR Band, NDVI Image and Histogram for the Vegetation Frame. The Red Dots Indicate the Pixels Having an NDVI Value Greater than 0.4

A.3. Normalized Difference Vegetation Index with Blue (NDVIB)

Because both red and blue light are absorbed by chlorophyll in live green vegetation, the blue band can also be used to calculate the NDVI. In that case, NDVIB is defined as

NDVIB = (NIR - BLUE)/(NIR + BLUE)    (A.3)

Figure A.5 shows an example of the blue band, NIR band, NDVIB image, and NDVIB histogram, using the blue band instead of the red band, for the same frame used in the NDVI demonstration. An NDVIB threshold of 0.6 is used, compared with the threshold of 0.4 used for NDVI in Figure A.4. Comparing Figures A.4 and A.5, both metrics are effective for classifying vegetation. Most importantly, with the blue band, considerably less shadow is detected as vegetation in the thresholded NDVIB image.


Figure A.5. Blue Band, NIR Band, NDVIB Image and Histogram for the Vegetation Frame (NDVIB Image - Frame 13). The Red Dots Indicate the Pixels Having an NDVIB Value Greater than 0.6.

Figure A.6 shows the NDVI and NDVIB images together with the red, blue, and NIR bands. It clearly shows that in the NDVIB image, the shadows are not detected as vegetation. The vegetation and non-vegetation areas in the frame are also differentiated even when they are in shadow, as shown in Figure A.6. The dirt road in shadow (encircled in broken cyan) has lower NDVIB values (appears dark), whereas vegetation in shadow (encircled in broken yellow) has higher NDVIB values (appears bright). Thus NDVIB, which uses the blue and NIR bands, is more effective and more robust in areas with considerable shadow than NDVI, which uses the red and NIR bands. One likely reason for this improvement is that blue light is more ambient (due to the sky), so areas in shadow receive more blue illumination than red illumination.

As shown in Figures A.2 through A.6, the vegetation indices are simple and effective indicators of the presence of green vegetation in the scene being observed. This ability is exploited here to classify the image frames as densely or sparsely vegetated. This classification is needed for modeling purposes, in order to analyze the effect of vegetation on the RX detection statistics and on the F distribution that is used to model them.
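
One possible way to turn a per-pixel index into a per-frame label is sketched below. The text does not state the rule actually used, so everything here is an assumption made for illustration: the frame is labeled by the fraction of pixels whose normalized difference index exceeds a threshold (0.6, as used for NDVIB above), and the 50% cutoff `dense_cutoff` is purely hypothetical.

```python
def normalized_difference(nir, visible):
    """(NIR - VIS) / (NIR + VIS); VIS is the red band for NDVI, blue for NDVIB."""
    total = nir + visible
    return (nir - visible) / total if total != 0 else 0.0


def vegetated_fraction(nir_band, visible_band, threshold):
    """Fraction of pixels whose index value exceeds the threshold."""
    flags = [
        normalized_difference(n, v) > threshold
        for nir_row, vis_row in zip(nir_band, visible_band)
        for n, v in zip(nir_row, vis_row)
    ]
    return sum(flags) / len(flags)


def classify_frame(nir_band, visible_band, threshold=0.6, dense_cutoff=0.5):
    """Label a frame from its vegetated fraction (dense_cutoff is hypothetical)."""
    fraction = vegetated_fraction(nir_band, visible_band, threshold)
    return "dense vegetation" if fraction >= dense_cutoff else "sparse vegetation"


if __name__ == "__main__":
    # Hypothetical 2 x 2 frame: three vegetation-like pixels, one dirt-like pixel.
    nir = [[0.50, 0.50], [0.50, 0.30]]
    blue = [[0.05, 0.06], [0.04, 0.25]]
    print(vegetated_fraction(nir, blue, 0.6), classify_frame(nir, blue))
```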

Figure A.6. Red, Blue, and NIR Bands with NDVI and NDVIB Images (Frame 21). The NDVI and NDVIB Value Histograms are Also Shown (annotations: dirt road in shadow; vegetation in shadow)


APPENDIX  B. 

THE  EM  ALGORITHM 
The EM (Expectation-Maximization) algorithm is a general algorithm for maximum likelihood estimation [Trees, 2007]. It is best suited to incomplete data sets or data sets with missing values. The term "EM" was coined by Dempster et al. in a paper in 1976. In this thesis, the EM algorithm is used for parameter estimation to model the RX test statistics in the form of a modified F distribution.

The EM algorithm estimates the parameters of the underlying distribution by maximizing the likelihood function $l(\theta \mid x)$ for the parameter(s) $\theta$ given the observed samples $x$, in an iterative fashion starting from an initial guess. Each iteration consists of the following two steps [Ganju, 2006; Bilmes, 1998]:

STEP 1: Expectation Step - This step finds the distribution of the complete data with respect to the known values of the observed data and the current estimate of the parameters. The step involves formulating the expectation of the likelihood (or log-likelihood) function of the complete data given the observed samples and the current fit of the parameters.

STEP 2: Maximization Step - This step maximizes the expectation computed in the first step; i.e., it re-estimates the parameters under the assumption that the distribution found in the first step is correct.

Both steps are carried out iteratively until the terminating condition is reached. The terminating condition can be either that the maximum number of iterations is reached or that the likelihood shows no significant improvement over its previous value. It has been proved in the literature that with each successive iteration, the likelihood either improves or remains unchanged (i.e., a local maximum has been attained) [McLachlan and Krishnan, 1997].

The mathematical formulation of the EM algorithm can be found in [Ganju, 2006; McLachlan and Krishnan, 1997].
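
The two steps and the terminating condition can be illustrated on a simpler problem than the one treated in this thesis. The sketch below applies EM to a one-dimensional mixture of two Gaussians, which is a substitute chosen only because its E and M steps have closed forms; it is not the modified F distribution model of Section B.1, and all names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weight, means, variances):
    return sum(
        math.log(weight * normal_pdf(x, means[0], variances[0])
                 + (1.0 - weight) * normal_pdf(x, means[1], variances[1]))
        for x in data
    )


def em_two_gaussians(data, means, variances, weight=0.5, max_iter=200, tol=1e-8):
    """EM for a two-component 1-D Gaussian mixture, starting from an initial guess."""
    means, variances = list(means), list(variances)
    history = [log_likelihood(data, weight, means, variances)]
    for _ in range(max_iter):
        # E-step: posterior probability that each sample belongs to component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, means[0], variances[0])
            p1 = (1.0 - weight) * normal_pdf(x, means[1], variances[1])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate the parameters using those posteriors as weights.
        n0 = sum(resp)
        n1 = len(data) - n0
        means[0] = sum(r * x for r, x in zip(resp, data)) / n0
        means[1] = sum((1.0 - r) * x for r, x in zip(resp, data)) / n1
        variances[0] = sum(r * (x - means[0]) ** 2 for r, x in zip(resp, data)) / n0
        variances[1] = sum((1.0 - r) * (x - means[1]) ** 2 for r, x in zip(resp, data)) / n1
        weight = n0 / len(data)
        history.append(log_likelihood(data, weight, means, variances))
        # Terminate when the likelihood no longer improves significantly.
        if history[-1] - history[-2] < tol:
            break
    return weight, means, variances, history


if __name__ == "__main__":
    random.seed(0)
    data = ([random.gauss(0.0, 1.0) for _ in range(300)]
            + [random.gauss(5.0, 1.0) for _ in range(300)])
    weight, means, variances, history = em_two_gaussians(data, (1.0, 4.0), (1.0, 1.0))
    print(weight, means, variances, len(history) - 1)
```

The recorded `history` of log-likelihood values should be non-decreasing from one iteration to the next, which is the monotonicity property cited above.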

B.1. Formulation of the update equations for the RX statistic

The EM algorithm described above is applied to the RX statistics in the following manner. The post non-max RX statistics can be written as

$f(r) = \frac{1}{K} f_1(r)\, e^{-N(1 - F_1(r))}$    (B.1)

where

$f_1(r) = \frac{r^{\nu_1/2 - 1}\,(1 + r)^{-(\nu_1 + \nu_2)/2}}{B(\nu_1/2, \nu_2/2)}$    (B.2)

and $F_1(r)$ is the CDF of $f_1(r)$ and $K$ is given as:

$K = \int f_1(r)\, e^{-N(1 - F_1(r))}\, dr = e^{-N} \int f_1(r)\, e^{N F_1(r)}\, dr$    (B.3)

Using the substitution $N F_1(r) = u$, it can be shown that

$K = \frac{1 - e^{-N}}{N}, \quad N \neq 0$    (B.4)


The result shown above is general and applies to all types of distributions under non-max suppression.
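
The closed form $K = (1 - e^{-N})/N$ can be checked numerically for any base distribution. The sketch below is such a check under stated assumptions: it uses a unit exponential and a uniform distribution as stand-ins for the base density (neither is the distribution used in this thesis), a plain midpoint rule for the integral, and hypothetical function names.

```python
import math


def k_closed_form(n):
    """K = (1 - exp(-N)) / N for N != 0."""
    return (1.0 - math.exp(-n)) / n


def k_numeric(n, pdf, cdf, upper, steps=100000):
    """Midpoint-rule estimate of the integral of pdf(r) * exp(-N (1 - cdf(r))) on [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += pdf(r) * math.exp(-n * (1.0 - cdf(r)))
    return total * h


if __name__ == "__main__":
    # Stand-in base distributions, chosen only for the check.
    exp_pdf = lambda r: math.exp(-r)
    exp_cdf = lambda r: 1.0 - math.exp(-r)
    uni_pdf = lambda r: 1.0
    uni_cdf = lambda r: r
    for n in (0.5, 2.0, 10.0):
        print(n, k_closed_form(n),
              k_numeric(n, exp_pdf, exp_cdf, upper=40.0),
              k_numeric(n, uni_pdf, uni_cdf, upper=1.0))
```

Both stand-in distributions should reproduce the same value of K for a given N, which is what the distribution-independence of the result implies.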


Now taking the natural logarithm of Equation (B.1)

$\ln f(r) = \left(\frac{\nu_1}{2} - 1\right) \ln r - \ln B\left(\frac{\nu_1}{2}, \frac{\nu_2}{2}\right) - \frac{\nu_1 + \nu_2}{2} \ln(1 + r) - N(1 - F_1(r)) - \ln K$    (B.5)


Taking the derivative of Equation (B.6) with respect to $\nu_1$, $\nu_2$, and $N$, and substituting $B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}$, provides the update equations for the EM algorithm for parameter estimation. Taking the derivative with respect to $\nu_1$ gives:
where $\psi(x)$ is the digamma function defined as

$\psi(x) = \frac{d(\ln \Gamma(x))}{dx}$    (B.6)
In a similar manner, the derivative of Equation (B.1) with respect to $\nu_2$ is obtained as
where
Also, the derivative with respect to $N$ is obtained as:
The derivative of the beta function with respect to $\nu_1$ and $\nu_2$ is defined as
Equations (B.6), (B.7), and (B.8) are the required update equations corresponding to the three parameters to be estimated, which will be used in the next step.

B.3.  Parameter  Estimation  from  update  equations 

Once the update equations are obtained, the information matrix is constructed. The information matrix, $I_m$, is a square matrix whose dimensions depend on the number of parameters to be estimated. For the current estimation problem, $I_m$ is therefore a 3x3 matrix. The information matrix for the current case is approximated as
Let $\nu_1^k$, $\nu_2^k$, and $N^k$ be the estimates of the parameters in the kth iteration. Then the estimate of these parameters in the (k+1)th iteration/step is given by
(B.9)

where $\lambda$ is the scaling factor.

These new estimates of the underlying parameters, $\nu_1^{k+1}$, $\nu_2^{k+1}$, and $N^{k+1}$, are then used in the three update equations, (B.6), (B.7), and (B.8), to yield a new information matrix, $I_m$. This information matrix is used in Equation (B.9) to obtain a new set of parameters in the (k+2)th iteration. The process continues until the parameters converge to steady-state values.


B.4.  EM  algorithm  -  Convergence  Properties 

As in any other non-convex optimization problem, the parameters may converge to a local maximum or a saddle point rather than to the global maximum. This depends mostly on the shape of the log-likelihood function. If the log-likelihood function is unimodal, then the convergence of the likelihood function and the parameters is unique. However, when the likelihood function is multimodal, the likelihood function and the parameters may converge to a local maximum or a saddle point.

In some cases, if the number of parameters to be estimated in a distribution is
large, the parameters are observed to undergo periodic oscillations after a certain
number of iterations [McLachlan and Krishnan, 1997]. This is because, as the number
of parameters increases, the likelihood surface tends to become flat, so rather than
converging to a steady value, the parameters tend to converge to a range. In this
situation, the EM algorithm is said to converge to a circle rather than to a single point.




Another important factor in the convergence of the EM algorithm is the starting
point supplied to it. It has been reported that if the log-likelihood function has
several maxima, minima, or stationary points, then the convergence of the EM
algorithm to the right point depends on the choice of the starting point [Wu, 1983];
i.e., if the starting point is near a saddle point or a local maximum, then it is
highly likely that the parameters converge to that saddle point or local maximum. For
the present estimation problem, the initial parameters are derived using the method
of moments, explained in detail in Appendix C.
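
The dependence on the starting point can be seen in a small sketch. It uses plain gradient ascent on a hypothetical one-dimensional function with two maxima, not the EM algorithm or the RX log-likelihood. The function, step size, and starting values are all illustrative assumptions.

```python
def f(x):
    """Hypothetical bimodal objective with a local and a global maximum."""
    return -(x ** 2 - 1) ** 2 + 0.3 * x


def grad(x):
    """Derivative of f."""
    return -4 * x * (x ** 2 - 1) + 0.3


def climb(x0, step=0.01, iters=5000):
    """Fixed-step gradient ascent started at x0."""
    x = x0
    for _ in range(iters):
        x += step * grad(x)
    return x


left = climb(-1.5)   # ends at the local maximum on the negative side
right = climb(1.5)   # ends at the global maximum on the positive side
print(left, f(left))
print(right, f(right))
```

Both runs stop at a stationary point, but only the run started on the positive side reaches the higher of the two maxima. A moment-based starting point is intended to place the first iterate in the right neighbourhood.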


APPENDIX  C. 

ESTIMATING  INITIAL  PARAMETERS  FOR  RX  DISTRIBUTION 
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Initial parameters play an important role in parameter estimation using the EM
algorithm. They must be chosen carefully so that the parameters converge to the
right values. For this reason, several alternatives were considered to find the best
way to provide the initial parameters to the EM algorithm. The approach adopted
here is based on the Method of Moments [Pearson, 1902]. In this method, moments
of the observed data are used. If there are $p$ parameters to estimate, then the
first $p$ sample moments are equated to the actual moments of the distribution,
given that the actual moments are functions of the parameters of interest.

For  the  central  F  distribution,  the  probability  density  function  is  defined  as 
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where $x$ is the random variable, $v_1$ is the numerator degrees of freedom, and
$v_2$ is the denominator degrees of freedom.

The  central  F  distribution  can  be  transformed  into  RX  statistics  as 
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(C.2) 


where $r = x \, v_1 / v_2$.

The RX statistic, $r$, needs to be multiplied by a scale factor, $k$, which
accounts for other non-ideal factors, as discussed in Section 4.1.

For the present case, the distribution of concern is that of the post non-max RX
detections, not that of the RX detections themselves. However, because the moments
of the post non-max RX detection statistics do not have a closed-form expression,
the method of moments cannot be used to estimate the parameters in that case.
Instead, $v_1$, $v_2$, and $k$ are estimated with this method for the RX
distribution and supplied as initial parameters to the EM algorithm.

The  first  three  moments  about  the  mean  or  standardized  moments  (mean, 
variance,  and  skewness)  for  the  transformed  and  scaled  central  F  distribution  function  are 
defined  as  follows  [Johnson,  1995]: 
Mean ($\mu$):

\[ \mu = \frac{v_1}{k (v_2 - 2)} \tag{C.3} \]

Variance ($\sigma^2$):

\[ \sigma^2 = \frac{2 v_1 (v_1 + v_2 - 2)}{k^2 (v_2 - 2)^2 (v_2 - 4)} \tag{C.4} \]

Skewness ($S$):

\[ S = \frac{(2 v_1 + v_2 - 2) \sqrt{8 v_2 - 32}}{(v_2 - 6) \sqrt{v_1^2 + v_1 v_2 - 2 v_1}} \tag{C.5} \]

where

$\mu = E\{r\}$,

$\sigma^2 = E\{(r - \mu)^2\}$,

$S = E\{(r - \mu)^3\} / \sigma^3$.


The factors $v_2 / v_1$ and $k$ are the scales by which the RX detections must be
multiplied to obtain a central F distribution, as explained above. As Equation (C.5)
shows, no scale appears in the skewness because it is a ratio of two quantities that
carry the same scale.

This is now a system of three equations in three unknowns, which can be solved in
closed form. The solution is derived below in detail:


From (C.3),

\[ v_1 = k \mu (v_2 - 2) \tag{C.6} \]

Substituting this value of $v_1$ into (C.4) and simplifying gives

\[ k = \frac{2 \mu}{\sigma^2 (v_2 - 4) - 2 \mu^2} \tag{C.7} \]

Substituting Equation (C.7) into (C.6) gives

\[ v_1 = \frac{2 \mu^2 (v_2 - 2)}{\sigma^2 (v_2 - 4) - 2 \mu^2} \tag{C.8} \]

Substituting Equations (C.7) and (C.8) into Equation (C.5) and simplifying gives

\[ v_2 = \frac{4 \mu^2 - 8 \sigma^2 + 6 S \mu \sigma}{S \mu \sigma - 2 \sigma^2} \tag{C.9} \]
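
The simplification that leads to Equation (C.9) can be sketched as follows. The shorthands $t = k\mu$ and $d = \sigma^2 (v_2 - 4) - 2\mu^2$ are introduced here only for convenience and are not part of the original derivation. The sketch assumes $\mu > 0$, $\sigma > 0$, $d > 0$, and $v_2 > 6$, and it takes the skewness in (C.5) to be the standard skewness of the central F distribution.

```latex
\begin{align*}
% From (C.6): v_1 = t (v_2 - 2), so both sums factor through (v_2 - 2).
v_1 + v_2 - 2 &= (t + 1)(v_2 - 2), &
2 v_1 + v_2 - 2 &= (2t + 1)(v_2 - 2) \\
% Substituting into (C.5), the factor (v_2 - 2) cancels.
S &= \frac{(2 v_1 + v_2 - 2)\sqrt{8 (v_2 - 4)}}{(v_2 - 6)\sqrt{v_1 (v_1 + v_2 - 2)}}
   = \frac{(2t + 1)\sqrt{8 (v_2 - 4)}}{(v_2 - 6)\sqrt{t (t + 1)}} \\
% From (C.7): t = k \mu = 2 \mu^2 / d.
t + 1 &= \frac{\sigma^2 (v_2 - 4)}{d}, &
2t + 1 &= \frac{\sigma^2 (v_2 - 4) + 2 \mu^2}{d}, \qquad
\sqrt{t (t + 1)} = \frac{\mu \sigma \sqrt{2 (v_2 - 4)}}{d} \\
% The common factor d and the square roots of (v_2 - 4) cancel.
S &= \frac{2 \left[ \sigma^2 (v_2 - 4) + 2 \mu^2 \right]}{\mu \sigma (v_2 - 6)} \\
S \mu \sigma (v_2 - 6) &= 2 \sigma^2 (v_2 - 4) + 4 \mu^2 \\
v_2 &= \frac{4 \mu^2 - 8 \sigma^2 + 6 S \mu \sigma}{S \mu \sigma - 2 \sigma^2}
\end{align*}
```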


Thus, the mean, variance, and skewness can be easily obtained from the observed
samples and then used in Equations (C.9), (C.8), and (C.7) to derive the estimates
of $v_2$, $v_1$, and $k$, respectively.
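
A minimal sketch of this calculation follows. The function names are hypothetical, and the biased (divide by n) sample moments are an assumption, since the text does not say which estimator is used. The helper `theoretical_moments` uses the mean and variance implied by Equations (C.6) and (C.7) together with the standard central F skewness, and serves only as a round-trip check.

```python
import math


def sample_moments(data):
    """Return the sample mean, biased variance, and skewness of the data."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    skew = sum((x - mu) ** 3 for x in data) / n / var ** 1.5
    return mu, var, skew


def initial_parameters(mu, var, skew):
    """Apply (C.9), then (C.8) and (C.7); returns (v1, v2, k)."""
    sigma = math.sqrt(var)
    v2 = (4 * mu ** 2 - 8 * var + 6 * skew * mu * sigma) / (
        skew * mu * sigma - 2 * var
    )
    denom = var * (v2 - 4) - 2 * mu ** 2
    v1 = 2 * mu ** 2 * (v2 - 2) / denom
    k = 2 * mu / denom
    return v1, v2, k


def theoretical_moments(v1, v2, k):
    """Moments of the scaled statistic; valid only for v2 > 6."""
    mu = v1 / (k * (v2 - 2))
    var = 2 * v1 * (v1 + v2 - 2) / (k ** 2 * (v2 - 2) ** 2 * (v2 - 4))
    skew = (2 * v1 + v2 - 2) * math.sqrt(8 * (v2 - 4)) / (
        (v2 - 6) * math.sqrt(v1 * (v1 + v2 - 2))
    )
    return mu, var, skew


# Round trip: known parameters -> moments -> recovered parameters.
recovered = initial_parameters(*theoretical_moments(5.0, 20.0, 2.0))
print(recovered)
```

In practice the moments would come from `sample_moments` applied to the RX detections. The sketch has no safeguards: noisy skewness estimates can produce a non-positive denominator or a value of $v_2$ at or below 6, in which case the result should not be used as a starting point.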

The RX detections are multiplied by the scale factor $k$ before they are passed to
the EM algorithm, whereas the scale factor $v_2 / v_1$ is applied dynamically inside
the EM algorithm. $v_1$ and $v_2$ are used as the initial parameter values for the
numerator and denominator degrees of freedom, respectively. The initial estimate of
the third parameter, $N$, is kept equal to 100 for all cases of parameter
estimation. In the future, it may be possible to include $k$ in the EM update
equations so that the value of $k$ is estimated along with $v_1$, $v_2$, and $N$.


APPENDIX  D. 

TEST STATISTICS TO MEASURE THE GOODNESS OF FIT
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Once  a  set  of  model  parameters  is  estimated  for  a  given  set  of  observations,  a 
metric  to  define  the  goodness  of  fit  is  of  interest.  Various  statistical  test  schemes  are 
available  to  measure  the  goodness  of  fit  between  the  observed  and  estimated  values  of  the 
distribution.  Some  commonly  used  goodness  of  fit  tests  are  as  follows: 

D.1. Cramer-von Mises (CVM) Test

The test statistic for this test is given by [Bain and Engelhardt, 1991]:

\[ CM = \frac{1}{12 n} + \sum_{i=1}^{n} \left[ F(x_{i:n}; \hat{\theta}) - \frac{i - 0.5}{n} \right]^2 \tag{D.2} \]

where

$n$ = number of observations,

$F(x_{i:n}; \hat{\theta})$ = CDF evaluated at the $i$th ordered observation, given $\hat{\theta}$,

$\hat{\theta}$ = parameter vector of the given distribution.

Here, an approximate size $\alpha$ test of $H_0: X \sim F$ is to reject $H_0$ if $CM > CM_{1-\alpha}$.
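
The statistic in Equation (D.2) can be computed with a short sketch. The function name and the uniform CDF used in the example are hypothetical; any callable that returns the fitted CDF value for a sample can be passed in.

```python
def cramer_von_mises(data, cdf):
    """Cramer-von Mises statistic, Equation (D.2), for a fitted CDF."""
    xs = sorted(data)  # ordered observations x_{1:n} <= ... <= x_{n:n}
    n = len(xs)
    total = sum(
        (cdf(x) - (i - 0.5) / n) ** 2 for i, x in enumerate(xs, start=1)
    )
    return 1.0 / (12 * n) + total


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used only as an example."""
    return min(max(x, 0.0), 1.0)


cm = cramer_von_mises([0.75, 0.25], uniform_cdf)
print(cm)
```

For these two samples the fitted CDF values coincide with (i - 0.5)/n, so the sum vanishes and only the constant term 1/(12n) remains. The critical value is not computed here and would have to come from tables for the distribution under test.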

D.2. Kolmogorov-Smirnov (KS) or Kuiper Test

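With the same assumptions as in D.1 [Bain and Engelhardt, 1991], the statistics are
given as:

\[ D^{+} = \max_{i} \left( \frac{i}{n} - F(x_{i:n}; \hat{\theta}) \right), \quad D^{-} = \max_{i} \left( F(x_{i:n}; \hat{\theta}) - \frac{i - 1}{n} \right), \]

\[ D = \max(D^{+}, D^{-}), \quad \text{and} \quad V = D^{+} + D^{-}. \]

The statistic $D$ is known as the “Kolmogorov-Smirnov” or “KS” statistic, and the
statistic $V$ is known as the “Kuiper” statistic. Here also, the size $\alpha$ test
of $H_0: X \sim F$ is to reject $H_0$ if $KS > KS_{1-\alpha}$.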
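
Both statistics can be obtained in one pass over the ordered sample. The sketch below uses the standard definitions of D+ and D- in terms of the empirical CDF steps. The function name and the uniform CDF in the example are hypothetical.

```python
def ks_and_kuiper(data, cdf):
    """Return (D, V): the Kolmogorov-Smirnov and Kuiper statistics."""
    xs = sorted(data)
    n = len(xs)
    # Largest amount by which the empirical CDF exceeds the fitted CDF.
    d_plus = max(i / n - cdf(x) for i, x in enumerate(xs, start=1))
    # Largest amount by which the fitted CDF exceeds the empirical CDF.
    d_minus = max(cdf(x) - (i - 1) / n for i, x in enumerate(xs, start=1))
    return max(d_plus, d_minus), d_plus + d_minus


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used only as an example."""
    return min(max(x, 0.0), 1.0)


d, v = ks_and_kuiper([0.25, 0.75], uniform_cdf)
print(d, v)
```

The Kuiper statistic adds the two one-sided deviations, which makes it equally sensitive to discrepancies in the tails and near the median, whereas D reports only the larger of the two.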

D.3. Chi-Square Test

The test statistic for the chi-square test is given by [Bain and Engelhardt, 1991]:

\[ \chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i} \tag{D.1} \]

where $\chi^2$ = test statistic,
$n$ = sample size,
$O_i$ = number of observations falling into the $i$th cell, and
$E_i$ = expected number of observations to fall in the $i$th cell.

Here, the hypothesis $H_0: X \sim F$ is rejected if $\chi^2 > \chi^2_{1-\alpha}$, where
$\alpha$ is the given significance level.

The chi-square test is the most flexible and general test. The main limitation of
the CVM and KS tests is that the critical values they require are available for only
a few distributions, such as the exponential, Weibull, and normal distributions,
though they can be derived for other distributions as well. For cases in which the
distribution is completely specified, these tests perform well. However, if the
parameters are estimated from the data, then the power of these tests is reduced.
This is because new critical values need to be formulated, and these must be
obtained for the specific parametric form that is to be tested. For the chi-square
test, this does not create any problems because the number of degrees of freedom is
adjusted in accordance with the number of parameters that are estimated from the
data. Because of this flexibility, the chi-square test is used for testing the
modeling results.

The degrees of freedom for the chi-square test are calculated in the following
manner. First, the samples are grouped into bins such that each bin has a fixed
number of samples. For the present scenario, the samples are grouped such that each
bin has 10 samples. If the number of bins is $n_b$ and $p$ parameters are estimated
from the data, then the number of degrees of freedom, $v$, is given as

\[ v = n_b - (p + 1) \]
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The test statistic is calculated for a particular FoR as per Equation (D.1). Once
the degrees of freedom and the test statistic are calculated, the threshold is found
for the given significance level. Since the test statistic follows a chi-square
distribution with $v$ degrees of freedom, the hypothesis $H_0: X \sim F$ is rejected
if $\chi^2 > \chi^2_{1-\alpha}$, where $\alpha$ is the given significance level.
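
The binning and degrees-of-freedom calculation can be sketched as follows. The text does not say how bin edges are placed, so the sketch assumes edges at the midpoints between neighbouring bins, with the outermost bins extended to infinity, and it merges any leftover samples into the last bin. The function name and the uniform CDF in the example are hypothetical.

```python
import math


def chi_square_equal_count(data, cdf, n_params, per_bin=10):
    """Return (chi-square statistic, degrees of freedom) for equal-count bins."""
    xs = sorted(data)
    n = len(xs)
    n_bins = n // per_bin
    if n_bins < 1:
        raise ValueError("need at least one full bin of samples")
    # Assumed edge placement: midpoints between the last sample of one bin
    # and the first sample of the next; outer edges at -inf and +inf.
    edges = [-math.inf]
    for b in range(1, n_bins):
        edges.append(0.5 * (xs[b * per_bin - 1] + xs[b * per_bin]))
    edges.append(math.inf)
    stat = 0.0
    for b in range(n_bins):
        # Leftover samples (n not divisible by per_bin) go into the last bin.
        observed = per_bin if b < n_bins - 1 else n - per_bin * (n_bins - 1)
        expected = n * (cdf(edges[b + 1]) - cdf(edges[b]))
        stat += (observed - expected) ** 2 / expected
    dof = n_bins - (n_params + 1)
    return stat, dof


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used only as an example."""
    return min(max(x, 0.0), 1.0)


samples = [(i + 0.5) / 40 for i in range(40)]  # evenly spread over [0, 1]
stat, dof = chi_square_equal_count(samples, uniform_cdf, n_params=0)
print(stat, dof)
```

With three estimated parameters and 10 samples per bin, at least 50 samples are needed for the degrees of freedom to be positive. The threshold itself is not computed in this sketch; it can be read from chi-square tables or obtained from a statistics library, for example `scipy.stats.chi2.ppf`.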
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